Book E: Enhanced PowerPC Architecture
Version 1.0
May 7, 2002
Third Edition (Dec 2001)
The following paragraph does not apply to the United Kingdom or any country where such provisions are inconsistent
with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS DOCUMENT “AS IS”
WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Some
states do not allow disclaimer of express or implied warranties in certain transactions; therefore, this statement may
not apply to you. IBM does not warrant that the use of the information herein shall be free from third party intellectual
property claims.
IBM does not warrant that the contents of this document will meet your requirements or that the document is error-free.
Changes are periodically made to the information herein; these changes will be incorporated in new editions of the
document. IBM may make improvements and/or changes in the product(s) and/or program(s) described in this
document at any time. This document does not imply a commitment by IBM to supply or make generally available the
product(s) described herein.
No part of this document may be reproduced or distributed in any form or by any means, or stored in a data base or
retrieval system, without the written permission of IBM.
Address comments about this document to:
IBM Corporation
Department B5H / Building 667
3039 Cornwallis Road P.O. Box 12195
Research Triangle Park, NC 27709
Portions of the information in this document may have been published previously in the following related documents:
The PowerPC Architecture: A Specification for a New Family of RISC Processors, Second Edition (1994)
The IBM PowerPC Embedded Environment: Architectural Specifications for IBM PowerPC Embedded Controllers,
Second Edition (1998)
IBM may have patents or pending patent applications covering the subject matter in this document. The furnishing of
this document does not give you any license to these patents. You can send license inquiries, in writing, to the IBM
Director of Licensing, North Castle Drive, Armonk, NY 10504, United States of America.
Copyright International Business Machines Corporation 1993, 2000. All rights reserved.
Printed in the United States of America.
The following terms are trademarks of IBM Corporation:
IBM PowerPC
Other terms which are trademarks are the property of their respective owners.
This release represents the initial release of the Book E architecture specification.
Many thanks to those in Motorola and IBM who have reviewed this document and
contributed so much to cleaning up after my carelessness.
The Editor
xii Book E: Enhanced PowerPC Architecture Version 1.0 07 May 02
1.1 Overview
32-bit Book E implementations will execute applications that adhere to the software
guidelines for 32-bit Book E software outlined in Appendix A and are not
The description of each instruction includes the mnemonic and a formatted list of
operands. Some examples are the following.
stw RS,D(RA)
addis RT,RA,SI
1.5.1 Notes
Architecture Note
Used to convey the direction of the architecture definition with respect to
a particular function or feature.
Programming Note
Used to convey recommendations and suggestions to software developers
on how a particular function or feature should be used in an application
or operating system.
Engineering Note
Used to convey information on implementation options or how a particular
feature might be supported. While the primary audience is hardware
developers, software developers should benefit as well.
Compiler Note
Used to convey information to compiler developers on how best to support or
deprecate a particular feature that is either being added to, is currently a
part of, or is being removed from Book E.
Compatibility Note
Used to convey information on compatibility with the PowerPC
Architecture.
Note
Used to convey information on generic, miscellaneous issues.
The following definitions and notation are used throughout the Book E document.
– Ranges of bits are specified by two numbers separated by a colon (:). The
range p:q consists of bits p through q.
• A period (.) as the last character of an instruction mnemonic means that the
instruction records status information in certain fields of the Condition Register
as a side effect of execution, as described in Chapter 3 through Chapter 5.
• The symbol || is used to describe the concatenation of two values. For example,
010 || 111 is the same as 010111.
• ⁿx means the replication of x, n times (i.e., x concatenated to itself n–1 times).
ⁿ0 and ⁿ1 are special cases:
– ⁿ0 means a field of n bits with each bit equal to 0. Thus ⁵0 is equivalent to
0b00000.
– ⁿ1 means a field of n bits with each bit equal to 1. Thus ⁵1 is equivalent to
0b11111.
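As an illustration only, the concatenation and replication notation can be modeled in Python using strings of '0' and '1' characters. The helper names `concat` and `replicate` are hypothetical and are not part of Book E; the string representation is an assumption made for readability.

```python
# Model bit values as strings of '0'/'1' characters (an illustrative
# assumption; Book E defines only the notation, not this representation).

def concat(a: str, b: str) -> str:
    """Model of a || b: the bits of a followed by the bits of b."""
    return a + b


def replicate(n: int, x: str) -> str:
    """Model of the replication notation: x repeated n times."""
    return x * n


print(concat("010", "111"))  # 010 || 111
print(replicate(5, "0"))     # a field of five 0 bits
print(replicate(5, "1"))     # a field of five 1 bits
```

The examples printed here are the ones used in the definitions above: 010 || 111, and the five-bit special cases of replication.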
1. Each bit and field in instructions, and in status and control registers (e.g. Integer Exception Register and Floating-Point
Status and Control Register) and other Special Purpose Registers, is either defined, allocated, or reserved. See Sections
1.5.4, 1.5.5, and 1.5.6.
block
The aligned unit of storage operated on by each Cache Management
instruction. The size of a block can vary by instruction and by
implementation. The maximum block size is one page.
boundedly undefined
Describes the results of executing a given instruction when those results could
have been achieved by executing an arbitrary sequence of instructions, starting
in the state the machine was in before executing the given instruction. Boundedly
undefined results for a given instruction may vary between implementations, and
between different executions on the same implementation, and are not further
defined in this document.
byte
An 8-bit element of storage.
context of a program
The environment (e.g., privilege and relocation) in which the program
executes. That context is controlled by the contents of certain system
registers, such as the Machine State Register, and of the address translation
tables.
data storage
The view of storage as seen by a Storage Access or Cache Management
instruction.
doubleword
A 64-bit element of storage.
exception
An error, unusual condition, or external signal that may set a status bit and
may or may not cause an interrupt, depending upon whether the
corresponding interrupt is enabled.
halfword
A 16-bit element of storage.
hardware
Any combination of hard-wired implementation, emulation assist, or interrupt
for software assistance. In the last case, the interrupt may be to an
architected location or to an implementation-dependent location. Any use of
emulation assists or interrupts to implement the architecture is described in
the User’s Manual for the implementation.
instruction completion
The point in time when the instruction causes no further effect on processor
state, when all results have been recorded in architected state.
instruction fetching
In general, instructions appear to execute sequentially, in the order in which
they appear in storage. The exceptions to this rule are listed below.
– Trap instructions for which the trap conditions are satisfied cause a Trap
exception type Program interrupt to be taken.
Programming Note
If a program modifies the instructions it intends to execute, it should execute the
sequence of instructions listed in Section 6.3.2 on page 139 before attempting to
execute the modified instructions, to ensure that the modifications have taken
effect with respect to instruction fetching.
instruction storage
The view of storage as seen by the mechanism that fetches instructions.
interrupt
The act of changing the machine state in response to an exception, as
described in Section 7 on page 143.
interrupt handler
A component of the system software that receives control when an interrupt
occurs. The interrupt handler includes a component for each of the various
kinds of interrupts. These interrupt-specific components are referred to as the
Alignment interrupt handler, the Data Storage interrupt handler, etc.
latency
Refers to the interval from the time an instruction begins execution until it
produces a result that is available for use by a subsequent instruction.
main storage
The level of the storage hierarchy in which all storage state is visible to all
processors and mechanisms in the system.
negative
Means less than zero.
page
A "power of 2"-aligned unit of storage for which protection and control
attributes are independently specifiable and for which reference and change
status are independently recorded.
performed
A load or instruction fetch by a processor or mechanism (P1) is performed
with respect to any processor or mechanism (P2) when the value to be
returned by the load or instruction fetch can no longer be changed by a store
by P2. A store by P1 is performed with respect to P2 when a load by P2 from
the location accessed by the store will return the value stored (or a value
stored subsequently). An instruction cache block invalidation by P1 is
performed with respect to P2 when an instruction fetch by P2 will not be
satisfied from the copy of the block that existed in its instruction cache when
the instruction causing the invalidation was executed, and similarly for a data
cache block invalidation. The preceding definitions apply regardless of
whether P1 and P2 are the same entity.
positive
Means greater than zero.
processor
A hardware component that executes Book E instructions specified in a
program.
program
A sequence of related instructions.
program order
The execution of instructions in the order required by the sequential
execution model (see below).
quadword
A 128-bit element of storage.
real page
A unit of real storage to which a virtual page is or could be mapped.
Additional exceptions to the rule that the processor obeys the sequential
execution model, beyond those described in ‘instruction fetching’, are the
following.
Engineering Note
Although External and imprecise interrupts must be considered in determining
whether an instruction is required by the sequential execution model, the fact that
these interrupts are not required to be recognized at any specific point in the
instruction stream allows an implementation to halt instruction dispatching and
delay recognition of the interrupt until the processor comes into a state consistent
with the sequential execution model. Such an implementation need not consider
these interrupts in determining whether an instruction is required by the sequential
execution model.
storage access
An access to a storage location caused by executing a Storage Access or Cache
Management instruction (‘data access’) or by fetching an instruction, or an
implicit access that occurs as a side effect of such an access (e.g., to translate
the effective address).
storage location
One or more sequential bytes of storage beginning at the address specified by
a Storage Access or Cache Management instruction or by the instruction
fetching mechanism. The number of bytes comprising the location is based on
the type of instruction being executed, or is four for instruction fetching.
system
A combination of processors, storage, and associated mechanisms that is
capable of executing programs. Sometimes the reference to system includes
services provided by the operating system.
trap interrupt
An interrupt that results from execution of a Trap instruction.
unavailable
Refers to a resource that cannot be used by the program. Storage is
unavailable if access to it is denied. Floating-point instructions are
unavailable if use of them is denied. See Section 7.6.8 on page 165.
word
A 32-bit element of storage.
All reserved fields in instructions should be zero. If they are not, the instruction
form is invalid: see Section 1.9.2, "Invalid Instruction Forms", on page 28.
The handling of reserved bits in System Registers (e.g. Integer Exception Register,
Floating-Point Status and Control Register) is implementation-dependent. Software
is permitted to write any value to such a bit with no visible effect on processors
that implement this version of Book E. A subsequent reading of the bit
returns a 0 if the last value written to the bit was 0 and returns an undefined
value (0 or 1) otherwise.
Engineering Note
Reserved bits in System Registers need not be implemented.
Programming Note
It is the responsibility of software to preserve bits that are now reserved in System Registers,
as they may be assigned a meaning in some future version of the architecture. The following
approach is recommended:
1. Initialize each such register supplying zeros for all reserved bits.
2. Alter (defined) bit(s) in the register by reading the register, altering only the desired
bit(s), and then writing the new value back to the register.
The Integer Exception Register and Floating-Point Status and Control Register are partial
exceptions to this recommendation. Software can alter the status bits in these
registers, preserving the reserved bits, by executing instructions that have the side effect
of altering the status bits. Similarly, software can alter any defined bit in the
Floating-Point Status and Control Register by executing a Floating-Point Status and Control
Register instruction. Using such instructions is likely to yield better performance than using
the method described in the second item above.
When a currently reserved bit is subsequently assigned a meaning, every effort will be
made to have the value to which the system initializes the bit correspond to the ‘old
behavior’.
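The read-modify-write step recommended in the Programming Note above can be sketched in Python as follows. This is a model on plain integers, not real register access; the function name `alter_defined_bits` and the example mask are hypothetical and chosen only for illustration.

```python
def alter_defined_bits(current: int, mask: int, new_bits: int) -> int:
    """Return a register image with only the bits selected by mask changed.

    current  -- value read from the register
    mask     -- 1s mark the defined bits that are to be altered
    new_bits -- desired values for those bits (bits outside mask are ignored)
    """
    # Keep every bit outside the mask exactly as it was read, so reserved
    # bits are written back unchanged.
    return (current & ~mask) | (new_bits & mask)


# Item 1: initialize the register image with zeros in all reserved bits.
reg = 0x0000_0000

# Item 2: read, alter only the desired bit(s), then write the value back.
EXAMPLE_MASK = 0x0000_8000  # hypothetical defined bit, for illustration only
reg = alter_defined_bits(reg, EXAMPLE_MASK, EXAMPLE_MASK)
print(hex(reg))
```

Masking `new_bits` with `mask` guards against a caller accidentally supplying values for bits it did not intend to change.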
Certain System Registers are defined as 32-bit registers, with their bits numbered
32:63. These 32-bit registers, with the exception of the Floating-Point Status and
Control Register and its unique behavior on Move From FPSCR instructions (see
Section 5.6.7 on page 104), can be treated as 64-bit registers with the upper 32
bits being reserved. However, Book E guarantees that the upper 32 bits of these
registers will remain reserved.
Preserved bits in System Registers are bits that were defined in the PowerPC
Architecture, are not defined in Book E, but are preserved to allow implementations
of Book E to support the legacy definition for software compatibility.
Engineering Note
Preserved bits in System Registers need not be implemented.
Programming Note
Software has the responsibility of maintaining the contents of preserved bits in System
Registers. Preserved bits may be assigned a meaning in some future version of Book E.
The following approach is recommended:
1. Initialize each such register supplying zeros for all preserved bits.
2. Alter (defined) bit(s) in the register by reading the register, altering only the desired
bit(s), and then writing the new value back to the register.
Engineering Note
Allocated bits in System Registers need not be implemented.
Architecture Note
Allocated bits are provided to support implementation-dependent extensions to
Book E.
Programming Note
It is the responsibility of software to preserve bits that are now allocated in System Registers,
as they may be assigned a meaning in some future version of the architecture. The following
approach is recommended:
1. Initialize each such register supplying zeros for all allocated bits.
2. Alter (defined) bit(s) in the register by reading the register, altering only the desired
bit(s), and then writing the new value back to the register.
The RTL descriptions cover the normal execution of the instruction, except that
"implicit" setting of the Condition Register, Integer Exception Register, and
Floating-Point Status and Control Register, such as to reflect the final status of the
execution of the instruction, is not always shown. (Explicit setting of these registers,
such as the setting of Condition Register Field 0 by the stwcx. instruction, is
shown.) The RTL descriptions do not cover all of the cases in which the interrupt
may be invoked, or for which the results are boundedly undefined, and may not
cover all invalid forms.
Notation Meaning
← Assignment
←f Assignment in which the data may be reformatted in the target location
¬ NOT logical operator (one’s complement)
+ Two's complement addition
– Two's complement subtraction, unary minus
× Multiplication
÷ Division (yielding quotient)
+dp Floating-point addition, result rounded to double-precision
–dp Floating-point subtraction, result rounded to double-precision
×dp Floating-point multiplication, product rounded to double-precision
÷dp Floating-point division, quotient rounded to double-precision
+sp Floating-point addition, result rounded to single-precision
–sp Floating-point subtraction, result rounded to single-precision
×sp Floating-point multiplication, product rounded to single-precision
÷sp Floating-point division, quotient rounded to single-precision
×fp Floating-point multiplication to ‘infinite’ precision (no rounding)
FPSquareRootDouble(x) Floating-point √x, result rounded to double-precision
FPSquareRootSingle(x) Floating-point √x, result rounded to single-precision
FPReciprocalEstimate(x) Floating-point estimate of 1/x
FPReciprocalSquareRootEstimate(x) Floating-point estimate of 1/√x
AllocateDataCacheBlock(x) If the block containing the byte addressed by x does not exist in the data
cache, allocate a block in the data cache and set the contents of the block to 0.
FlushDataCacheBlock(x) If the block containing the byte addressed by x exists in the data cache
and is dirty, the block is written to main storage and is removed from the data cache.
InvalidateDataCacheBlock(x) If the block containing the byte addressed by x exists in the data cache,
the block is removed from the data cache.
If little-endian storage (see Section 6.2.5.5 on page 136), the byte at address
x is the least-significant byte and the byte at address x+y–1 is the
most-significant byte of the value being accessed.
MOD(x,y) Modulo y of x (remainder of x divided by y).
ROTL64(x, y) Result of rotating the 64-bit value x left y positions
ROTL32(x, y) Result of rotating the 64-bit value x||x left y positions, where x is 32 bits
long
SINGLE(x) Result of converting x from floating-point double format to floating-point
single format, using the model shown on page 100.
SPREG(x) Special Purpose Register x
TRAP Invoke a Trap type Program interrupt
undefined An undefined value. The value may vary between implementations, and
between different executions on the same implementation.
CIA Current Instruction Address, which is the 64-bit address of the instruction
being described by a sequence of RTL. Used by relative branches to
set the Next Instruction Address (NIA), and by Branch instructions with
LK=1 to set the Link Register. CIA does not correspond to any architected
register.
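The two rotate functions in the notation table can be modeled as below. This is a minimal Python sketch on unsigned integers; the lower-case names `rotl64` and `rotl32` and the constant `MASK64` are assumptions for illustration, and the result of `rotl32` is returned as the full 64-bit rotated value of x||x, following the table's wording.

```python
MASK64 = (1 << 64) - 1  # all 64 bits set


def rotl64(x: int, y: int) -> int:
    """Rotate the 64-bit value x left by y positions."""
    x &= MASK64
    y %= 64
    return ((x << y) | (x >> (64 - y))) & MASK64


def rotl32(x: int, y: int) -> int:
    """Rotate the 64-bit value x||x left by y positions, where x is 32 bits."""
    x &= 0xFFFF_FFFF
    doubled = (x << 32) | x  # x || x
    return rotl64(doubled, y)


print(hex(rotl64(0x8000_0000_0000_0001, 1)))
print(hex(rotl32(0x1234_5678, 8)))
```

Because the operand of `rotl32` is duplicated before rotating, both halves of the 64-bit result hold the same 32-bit rotation of x.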
The precedence rules for RTL operators are summarized in Table 1-1. Operators
higher in the table are applied before those lower in the table. Operators at the
same level in the table associate from left to right, from right to left, or not at all,
as shown. (For example, – associates from left to right, so a–b–c = (a–b)–c.)
Parentheses are used to override the evaluation order implied by the table or to increase
clarity: parenthesized expressions are evaluated before serving as operands.
Operators Associativity
subscript, function evaluation left to right
pre-superscript (replication), post-superscript (exponentiation) right to left
unary –, ¬ right to left
×, ÷ left to right
+, – left to right
|| left to right
=, ≠, <, ≤, >, ≥, <u, >u, ? left to right
&, ⊕, ≡ left to right
| left to right
: (range) none
← none
The architecture defines the instruction set, the storage model, interrupt action,
and other facilities. Instructions that the processor can execute fall into several
classes:
There are no computational instructions that reference storage, except Load Half
Algebraic, Load Floating-Point Single, Load Floating-Point Double, and Store
Floating-Point Single. Normally, to use a storage operand in a computation and then
modify the same or another storage location, the contents of storage must be
loaded into a register, modified, and then stored back to the target location.
Figure 1-3 shows the user-mode registers of Book E. Figure 1-4 shows the significant
supervisor-mode registers of Book E. Figure 1-6 shows the interrupt-specific
registers of Book E. Figure 1-7 shows the storage control-specific register of
Book E. Figure 1-8 shows the timer-specific registers of Book E. Figure 1-9 shows
the debug-specific registers of Book E. Note that bits for 32-bit registers are numbered
32:63 rather than 0:31 to indicate their true bit alignment with respect to
64-bit registers. 32-bit registers can be correctly interpreted as 64-bit registers
with bits 0:31 permanently reserved.
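To illustrate this numbering convention, the following Python sketch extracts bits p:q from a 64-bit value, treating bit 0 as the most-significant bit so that a 32-bit register occupies bits 32:63. The function name `bits` is an assumption for illustration and is not Book E notation.

```python
def bits(value: int, p: int, q: int) -> int:
    """Return bits p:q of a 64-bit value, where bit 0 is the most significant."""
    assert 0 <= p <= q <= 63
    width = q - p + 1
    # Bit 63 is the least-significant bit, so bit q sits (63 - q) places
    # above the bottom of the value.
    return (value >> (63 - q)) & ((1 << width) - 1)


reg32 = 0xDEAD_BEEF             # contents of a 32-bit register (example value)
print(hex(bits(reg32, 32, 47)))  # upper half of the 32-bit register
print(hex(bits(reg32, 48, 63)))  # lower half of the 32-bit register
print(bits(reg32, 0, 31))        # the reserved upper 32 bits read as 0 here
```

Viewing the 32-bit contents as a 64-bit value leaves bits 0:31 zero in this model, which matches the statement that those bits are reserved.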
GPR 0 through GPR 31: General Purpose Registers (page 53), bits 0:63
FPR 0 through FPR 31: Floating-Point Registers (page 69), bits 0:63
SPRG3, SPRG4, SPRG7: bits 0:63
SPRG0 through SPRG7: Software-use SPRs (page 42), bits 0:63
1. SPRG3 user-mode accessibility is implementation-dependent. SPRG4, SPRG5, SPRG6, and SPRG7 are user-mode read-access only.
SRR1: Save/Restore Register 1 (page 144), bits 32:63
IVOR0 through IVOR15: Interrupt Vector Offset Registers (page 147), bits 32:63
All instructions to be executed are four bytes long and word-aligned in storage.
Thus, whenever instruction addresses are presented to the processor (as in
Branch and Branch Extended instructions) the two low-order bits are treated as
0s. Similarly, whenever the processor develops an instruction address its two low-
order bits are zero.
Bits 0:5 always specify the primary opcode (OPCD, below). Many instructions also
have an extended opcode (XO, below). The remaining bits of the instruction
contain one or more fields as shown below for the different instruction formats.
The format diagrams given below show horizontally all valid combinations of
instruction fields.
In some cases an instruction field occupies more than one contiguous sequence of
bits, or occupies one contiguous sequence of bits that are used in permuted order.
Such a field is called a split field. In the format diagrams given below and in the
individual instruction layouts, the name of a split field is shown in lower-case
letters, once for each of the contiguous sequences, each with their respective bit
numbering. In the RTL description of an instruction having a split field, and in
certain other places where individual bits of a split field are identified, the name of
the field in upper-case letters represents the bit-ordered concatenation of the
sequences.
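As a sketch of how a split field might be reassembled, the Python below concatenates bit ranges of a 32-bit instruction word in the order listed, using the sprn layout 16:20||11:15 given in this document's field list. The helper names `field` and `split_field` are hypothetical, and the instruction word is synthetic rather than a real encoding.

```python
def field(instr: int, p: int, q: int) -> int:
    """Return bits p:q of a 32-bit instruction word (bit 0 is most significant)."""
    width = q - p + 1
    return (instr >> (31 - q)) & ((1 << width) - 1)


def split_field(instr: int, *ranges) -> int:
    """Concatenate the given (p, q) bit ranges, in the order listed."""
    value = 0
    for p, q in ranges:
        value = (value << (q - p + 1)) | field(instr, p, q)
    return value


# Synthetic word, not a real encoding: bits 11:15 = 0b01000 and
# bits 16:20 = 0b00001, with every other bit zero.
word = (0b01000 << (31 - 15)) | (0b00001 << (31 - 20))

# sprn is listed as 16:20 || 11:15, so bits 16:20 form the high-order part.
sprn = split_field(word, (16, 20), (11, 15))
print(bin(sprn))
```

Passing the ranges in their listed order keeps the helper general: the same function handles a field whose pieces appear in permuted order.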
OPCD BO BI BD AA LK
0 6 11 16 30 31
OPCD BF / L RA SI
OPCD BF / L RA UI
OPCD FRS RA D
OPCD FRT RA D
OPCD RS RA D
OPCD RS RA UI
OPCD RT RA D
OPCD RT RA SI
OPCD TO RA SI
0 6 9 10 11 16 31
OPCD RS RA DE XO
OPCD RS RA DES XO
OPCD RT RA DE XO
OPCD RT RA DES XO
0 6 11 16 28 31
OPCD LI AA LK
0 6 30 31
OPCD RS RA RB MB ME Rc
OPCD RS RA SH MB ME Rc
0 6 11 16 21 26 31
OPCD /// XO /
OPCD ??? XO /
OPCD /// RA RB XO /
OPCD ??? RA RB XO ?
OPCD BF /// XO /
OPCD BF /// U / XO Rc
OPCD BF / L RA RB XO /
OPCD BT /// XO Rc
OPCD CT RA RB XO /
OPCD FRS RA RB XO /
OPCD FRT RA RB XO /
OPCD MO /// XO /
OPCD RS RA /// XO Rc
OPCD RS RA /// XO /
OPCD RS RA RB XO /
OPCD RS RA RB XO Rc
OPCD RS RA RB XO 1
OPCD RS RA NB XO /
OPCD RS RA SH XO Rc
OPCD RT /// XO /
OPCD RT RA /// XO /
OPCD RT RA /// XO Rc
OPCD RT RA RB XO /
OPCD RT RA RB XO Rc
OPCD RT RA NB XO /
OPCD TO RA RB XO /
0 6 9 10 11 14 15 16 18 20 21 31
OPCD /// 1 /
0 6 30 31
OPCD RS / FXM / XO /
OPCD /// XO /
OPCD BO BI /// XO LK
OPCD BT BA BB XO /
0 6 9 11 14 16 21 31
AA (30)
Absolute Address bit.
BA (11:15)
Field used to specify a bit in the Condition Register to be used as a source.
BB (16:20)
Field used to specify a bit in the Condition Register to be used as a source.
BD (16:29)
Immediate field specifying a 14-bit signed two's complement branch
displacement which is concatenated on the right with 0b00 and sign-extended
to 64 bits.
BF (6:8)
Field used to specify one of the Condition Register fields or one of the
Floating-Point Status and Control Register fields to be used as a target.
BFA (11:13)
Field used to specify one of the Condition Register fields or one of the
Floating-Point Status and Control Register fields to be used as a source.
BI (11:15)
Field used to specify a bit in the Condition Register to be used as the
condition of a Branch Conditional instruction.
BT (6:10)
Field used to specify a bit in the Condition Register or in the Floating-Point
Status and Control Register to be used as a target.
CT (6:10)
Field used by the Cache Touch instructions (dcbt[e], dcbtst[e], and icbt[e]) to
specify the portion of the cache facility in which to place the prefetched data or
instructions. The interpretation of this field is implementation-dependent.
D (16:31)
Immediate field used to specify a 16-bit signed two's complement integer
which is sign-extended to 64 bits.
dcrn (16:20||11:15)
Field used to specify a Device Control Register for the mtdcr and mfdcr
instructions.
DE (16:27)
Immediate field used to specify a 12-bit signed two's complement integer
which is sign-extended to 64 bits.
DES (16:27)
Immediate field used to specify a 12-bit signed two's complement integer
which is concatenated on the right with 0b00 and sign-extended to 64 bits.
E (15)
Immediate field used to specify a 1-bit value used by wrteei to place in the EE
(External Input Enable) bit of the Machine State Register.
FLM (7:14)
Field mask used to identify the Floating-Point Status and Control Register
fields that are to be updated by the mtfsf instruction.
FRA (11:15)
Field used to specify a Floating-Point Register to be used as a source.
FRB (16:20)
Field used to specify a Floating-Point Register to be used as a source.
FRC (21:25)
Field used to specify a Floating-Point Register to be used as a source.
FRS (6:10)
Field used to specify a Floating-Point Register to be used as a source.
FRT (6:10)
Field used to specify a Floating-Point Register to be used as a target.
FXM (12:19)
Field mask used to identify the Condition Register fields that are to be
updated by the mtcrf instruction.
L (10)
Field used to specify whether an integer Compare instruction is to compare
64-bit numbers or 32-bit numbers.
LK (31)
LINK bit.
mb (26 || 21:25)
Field used in MD-form and MDS-form Rotate instructions to specify the first
1-bit of a 64-bit mask, as described in Section 4.3.7 on page 63.
me (26 || 21:25)
Field used in MD-form and MDS-form Rotate instructions to specify the last
1-bit of a 64-bit mask, as described in Section 4.3.7 on page 63.
MO (6:10)
Field used to specify the subset of storage accesses that are ordered by the
Memory Barrier instruction.
NB (16:20)
Field used to specify the number of bytes to move in an immediate Move
Assist instruction.
OPCD (0:5)
Primary opcode field.
RA (11:15)
Field used to specify a General Purpose Register to be used as a source or as
a target.
RB (16:20)
Field used to specify a General Purpose Register to be used as a source.
Rc (31)
RECORD bit.
RS (6:10)
Field used to specify a General Purpose Register to be used as a source.
RT (6:10)
Field used to specify a General Purpose Register to be used as a target.
SH (16:20)
Field used to specify a shift amount in Rotate Word Immediate and Shift Word
Immediate instructions.
SI (16:31)
Immediate field used to specify a 16-bit signed integer.
sprn (16:20||11:15)
Field used to specify a Special Purpose Register for the mtspr and mfspr
instructions.
TO (6:10)
Field used to specify the conditions on which to trap. The encoding is
described in Section 4.3.6 on page 62.
U (16:19)
Immediate field used as the data to be placed into a field in the Floating-Point
Status and Control Register.
UI (16:31)
Immediate field used to specify a 16-bit unsigned integer.
WS (18:20)
Field used to specify a word in the Translation Lookaside Buffer entry being
accessed.
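The field definitions above can be exercised with a small decoder. The Python sketch below extracts the OPCD, BO, BI, BD, AA, and LK fields at the bit positions listed and forms the branch displacement as BD concatenated on the right with 0b00 and sign-extended to 64 bits. The function names and the example word (field values chosen arbitrarily) are assumptions for illustration.

```python
MASK64 = (1 << 64) - 1


def field(instr: int, p: int, q: int) -> int:
    """Return bits p:q of a 32-bit instruction word (bit 0 is most significant)."""
    width = q - p + 1
    return (instr >> (31 - q)) & ((1 << width) - 1)


def sign_extend(value: int, width: int) -> int:
    """Interpret the low `width` bits of value as a two's complement integer."""
    sign_bit = 1 << (width - 1)
    value &= (1 << width) - 1
    return (value ^ sign_bit) - sign_bit


def decode_branch_fields(instr: int) -> dict:
    """Decode the fields OPCD(0:5) BO(6:10) BI(11:15) BD(16:29) AA(30) LK(31)."""
    bd = field(instr, 16, 29)
    return {
        "OPCD": field(instr, 0, 5),
        "BO": field(instr, 6, 10),
        "BI": field(instr, 11, 15),
        "BD": bd,
        "AA": field(instr, 30, 30),
        "LK": field(instr, 31, 31),
        # BD || 0b00, sign-extended, shown as a 64-bit two's complement value.
        "displacement": sign_extend(bd << 2, 16) & MASK64,
    }


# Example word assembled from arbitrary field values (BD is all ones, i.e. -1).
example = (16 << 26) | (4 << 21) | (2 << 16) | (0x3FFF << 2) | (0 << 1) | 1
print(decode_branch_fields(example))
```

A BD value of all ones represents a displacement of minus four bytes, which appears as 0xFFFF_FFFF_FFFF_FFFC once sign-extended to 64 bits.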
An instruction falls into exactly one of the following four classes, which is
determined by examining the primary opcode, and the extended opcode, if any.
One exception to this is that, for implementations which only provide the 32-bit
subset of Book E, it is not expected (and likely not even possible) that emulation of
the 64-bit behavior of the defined instructions will be provided by the system. See
Appendix A, “Guidelines for 32-bit Book E”, on page 371.
• perform the actions described in the rest of this document, if the instruction
is recognized and supported by the implementation. The architected behavior
may cause other exceptions.
Allocated instructions are allocated to purposes that are outside the scope of
Book E for implementation-dependent and application-specific use.
• perform the actions described in the User’s Manual for the implementation.
The implementation-dependent behavior may cause other exceptions.
Note
Some allocated instructions may have associated new process state, and, therefore, may
provide an associated enable bit, similar in function to MSR[FP] for floating-point
instructions. ‘Enabled for execution’ for these instructions implies any associated enable bit is
set to allow, or enable, execution of these instructions. For these allocated instructions,
the architecture provides an Auxiliary Processor Unavailable interrupt vector (see
Section 7.6.10 on page 166) in the event execution of any of these instructions is
attempted when not ‘enabled for execution’.
Other allocated instructions may not have any associated new state and therefore may
not require an associated enable bit. These instructions are assumed to always be
‘enabled for execution’ if they are supported by the implementation.
This class of instructions consists of all instruction primary opcodes (and associated
extended opcodes, if applicable) which do not belong to the defined,
allocated, or preserved instruction classes.
Reserved instructions are available for future extensions of Book E. That is, some
future version of Book E may define any of these instructions to perform new
functions or make them available for implementation-dependent use as allocated
instructions. There are two types of reserved instructions, reserved-illegal and
reserved-nop.
Engineering Note
Implementations are strongly encouraged to support reserved-nop instructions as true
no-operation instructions.
There is one defined instruction that has a preferred form. The Or Immediate
instruction is the preferred form for expressing a no-operation.
Some of the defined instructions have invalid forms. An instruction form is invalid
if one or more fields of the instruction, excluding the opcode field(s), are coded
incorrectly in a manner that can be deduced by examining only the instruction
encoding.
Any attempt to execute an invalid form of an instruction will either cause an
Illegal Instruction type Program interrupt or yield boundedly undefined results.
Exceptions to this rule are stated in the instruction descriptions.
Some kinds of invalid form instructions can be deduced just from examining the
instruction layout. These are listed below.
Engineering Note
Causing an Illegal Instruction type Program interrupt if an attempt is made to execute
an invalid form of an instruction facilitates the debugging of software.
One exception is that Book E strongly recommends that hardware ignore bit 31 of the
instruction encoding for X-form Storage Access instructions rather than causing an
Illegal Instruction type Program interrupt when this reserved bit is set to 1. This facilitates
subsequent definition of bit 31 for performance enhancement extensions to the architecture
while remaining functionally compatible with implementations of previous versions
of Book E.
1.10 Optionality
A program references storage using the effective address computed by the processor
when it executes a Storage Access or Branch instruction (or certain other
instructions described in Section 6.3.2 on page 139, and Section 6.3.3 on
page 142), or when it fetches the next sequential instruction.
Bytes in storage are numbered consecutively starting with 0. Each number is the
address of the corresponding byte.
Storage operands may be bytes, halfwords, words, or doublewords, or, for the
Load Multiple, Store Multiple, Load String and Store String instructions, a sequence
of words or bytes. The address of a storage operand is the address of its first byte
(i.e., of its lowest-numbered byte). Byte ordering can be either big-endian or little-
endian (see Section 1.11.3 on page 33).
Operand length is implicit for each instruction with respect to storage alignment.
The operand of a scalar Storage Access instruction has a ‘natural’ alignment
boundary equal to the operand length. In other words, the ‘natural’ address of an
operand is an integral multiple of the operand length. A storage operand is said to
be aligned if it is aligned at its natural boundary: otherwise it is said to be
unaligned.
Storage operands for single-register Storage Access instructions have the follow-
ing characteristics.
Operand            Length    Addr[60:63] if aligned
Byte (or String)   8 bits    xxxx
Halfword           2 bytes   xxx0
Word               4 bytes   xx00
Doubleword         8 bytes   x000
An ‘x’ in an address bit position indicates that the bit can be 0 or 1 independent of
the state of other bits in the address.
The concept of alignment is also applied more generally, to any datum in storage.
For example, a 12-byte datum in storage is said to be word-aligned if its address
is an integral multiple of 4.
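The alignment rule above can be sketched in C as follows. This is only an illustration: the function name is_aligned is hypothetical, and the operand length is assumed to be a power of two (1, 2, 4, or 8 bytes), which makes the test equivalent to checking the low-order address bits shown in the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if addr is an integral multiple of len.
   Assumption: len is a power of two, as it is for scalar operands. */
bool is_aligned(uint64_t addr, uint64_t len)
{
    assert(len != 0 && (len & (len - 1)) == 0);
    /* For a power-of-two len, the low-order log2(len) address bits
       must all be 0 (e.g. xx00 for a word). */
    return (addr & (len - 1)) == 0;
}
```

For example, is_aligned(0x1004, 4) holds, so a 12-byte datum at address 0x1004 is word-aligned, whereas is_aligned(0x1004, 8) does not hold.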
The 64-bit address computed by the processor when executing a Storage Access
or Branch instruction (or certain other instructions described in Section 6.3.2 on
page 139, and Section 6.3.3 on page 142), or when fetching the next sequential
instruction, is called the effective address and specifies a byte in storage. For a
Storage Access instruction, if the sum of the effective address and the operand
length exceeds the maximum effective address, the storage access is considered to
be undefined.
Effective address arithmetic, except for next sequential instruction address
computations, wraps around from the maximum address, 2^64–1, to address 0.
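The wraparound can be modeled with unsigned 64-bit arithmetic in C, which is defined to wrap modulo 2^64. The sketch below is illustrative only: the function name and the base-plus-displacement form are assumptions, not a definition of any particular addressing mode.

```c
#include <stdint.h>

/* Sketch: add a signed displacement to a 64-bit base address.
   Unsigned arithmetic in C wraps modulo 2^64, matching the
   wraparound from the maximum address to address 0. */
uint64_t effective_address(uint64_t base, int64_t disp)
{
    return base + (uint64_t)disp;
}
```

With this model, a base of 0xFFFF_FFFF_FFFF_FFFF plus a displacement of 1 yields address 0.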
Programming Note
While some implementations may support next sequential instruction address computa-
tions wrapping from the highest address 0xFFFF_FFFF_FFFF_FFFC to
0x0000_0000_0000_0000 as part of the instruction flow, portable software is strongly
encouraged not to depend on this. If code must span this boundary, software should
place a non-linking branch at address 0xFFFF_FFFF_FFFF_FFFC which always
branches to address 0x0000_0000_0000_0000 (either absolute or relative branches will
work). See also Section A.1.3 on page 372.
If scalars (individual data items and instructions) were indivisible, there would be
no such concept as “byte ordering.” It is meaningless to consider the order of bits
or groups of bits within the smallest addressable unit of storage, because nothing
can be observed about such order. Only when scalars, which the programmer and
processor regard as indivisible quantities, can comprise more than one address-
able unit of storage does the question of order arise.
For a machine in which the smallest addressable unit of storage is the 64-bit dou-
bleword, there is no question of the ordering of bytes within doublewords. All
transfers of individual scalars between registers and storage are of doublewords,
and the address of the byte containing the high-order eight bits of a scalar is no
different from the address of a byte containing any other part of the scalar.
Given a scalar that contains multiple bytes, the choice of byte ordering is essen-
tially arbitrary. There are 4! = 24 ways to specify the ordering of four bytes within
a word, but only two of these orderings are sensible:
• The ordering that assigns the lowest address to the highest-order (‘left-most’)
eight bits of the scalar, the next sequential address to the next-highest-order
eight bits, and so on.
• The ordering that assigns the lowest address to the lowest-order (‘right-most’)
eight bits of the scalar, the next sequential address to the next-lowest-order
eight bits, and so on.
Book E provides support for both big-endian and little-endian byte ordering in the
form of a storage attribute. See Section 6.2.5 on page 132 and Section 6.2.5.5 on
page 136.
C structure mapping rules permit the use of padding (skipped bytes) to align sca-
lars on desirable boundaries. The structure mapping examples below show each
scalar aligned at its natural boundary. This alignment introduces padding of four
bytes between a and b, one byte between d and e, and two bytes between e and f.
The same amount of padding is present in both big-endian and little-endian map-
pings.
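The declaration of structure s is not reproduced in this text. The following sketch assumes a declaration that is consistent with the padding described above and with the byte values shown in the mappings. The fixed-width types, the explicit 8-byte alignment of b, and the modeling of c as a plain 32-bit word are all assumptions made so that the layout can be checked on a typical C11 compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed declaration, reconstructed from the mappings (hypothetical). */
struct s {
    uint32_t a;               /* word:       0x11121314           */
    _Alignas(8) uint64_t b;   /* doubleword: 0x2122232425262728   */
    uint32_t c;               /* word:       0x31323334           */
    char     d[7];            /* bytes:      'A' through 'G'      */
    uint16_t e;               /* halfword:   0x5152               */
    uint32_t f;               /* word:       0x61626364           */
};

/* Each scalar sits at its natural boundary. */
static_assert(offsetof(struct s, b) == 0x08, "four bytes of padding after a");
static_assert(offsetof(struct s, c) == 0x10, "c follows b directly");
static_assert(offsetof(struct s, d) == 0x14, "d follows c directly");
static_assert(offsetof(struct s, e) == 0x1C, "one byte of padding after d");
static_assert(offsetof(struct s, f) == 0x20, "two bytes of padding after e");
```

The offsets do not depend on byte ordering, which is why the same padding appears in both mappings.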
Big-Endian Mapping
The big-endian mapping of structure s follows. (The data is in boldface print in the
structure mappings. Addresses, in hexadecimal, are below the data stored at the
address. The contents of each byte, as defined in structure s, are shown as a
hexadecimal number or, for the string elements, a character.)
11   12   13   14
0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07
21   22   23   24   25   26   27   28
0x08 0x09 0x0A 0x0B 0x0C 0x0D 0x0E 0x0F
31   32   33   34   'A'  'B'  'C'  'D'
0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
'E'  'F'  'G'       51   52
0x18 0x19 0x1A 0x1B 0x1C 0x1D 0x1E 0x1F
61   62   63   64
0x20 0x21 0x22 0x23 0x24 0x25 0x26 0x27
Little-Endian Mapping
14   13   12   11
0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07
28   27   26   25   24   23   22   21
0x08 0x09 0x0A 0x0B 0x0C 0x0D 0x0E 0x0F
34   33   32   31   'A'  'B'  'C'  'D'
0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17
'E'  'F'  'G'       52   51
0x18 0x19 0x1A 0x1B 0x1C 0x1D 0x1E 0x1F
64   63   62   61
0x20 0x21 0x22 0x23 0x24 0x25 0x26 0x27
MSB            LSB
0x00 0x01 0x02 0x03
LSB            MSB
0x00 0x01 0x02 0x03
If storage is reprogrammed from one endian format to the other, the contents of
the storage must be reloaded with program and data structures in the appropriate
endian format. If the contents of instruction memory change, the instruction
cache must be made coherent with the updates. The instruction cache must be
invalidated and the updated memory contents must be fetched in the new endian
format so that the proper byte ordering occurs in the event that this byte reversal
is performed between the memory interface and the cache.
• The word a has its four bytes reversed within the word spanning addresses
0x00–0x03.
• The halfword e has its two bytes reversed within the halfword spanning
addresses 0x1C–0x1D.
Note that the array of bytes d, where each data item is a byte, is not reversed
when the big-endian and little-endian mappings are compared. For example, the
character ‘A’ is located at address 0x14 in both the big-endian and little-endian
mappings.
The size of the data item being loaded or stored must be known before the proces-
sor can decide whether, and if so, how to reorder the bytes when moving them
between a register and storage.
• For byte loads and stores, including strings, no reordering of bytes occurs.
• For halfword loads and stores, bytes may be reversed within the halfword,
depending on the byte order.
• For word loads and stores, bytes may be reversed within the word, depending
on the byte order.
• For doubleword loads and stores, bytes may be reversed within the double-
word, depending on the byte order.
For example, when loading a data word from storage, all four bytes of the word are
retrieved from memory (or the data cache) starting with the byte at the calculated
effective address and continuing with the next three higher numbered bytes.
Then, the bytes are placed in the register so that the byte from either the highest
address or the lowest address is placed in the least-significant byte of the register
for big-endian or little-endian storage, respectively.
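The word-load example can be sketched in C as follows. This is a model of the byte placement only, not of any implementation's data path. The function name load_word and the boolean selector are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assemble a 32-bit register value from the four bytes starting at p,
   where p[0] is the byte at the effective address. */
uint32_t load_word(const uint8_t *p, bool little_endian)
{
    if (little_endian) {
        /* Little-endian: the lowest address supplies the
           least-significant byte of the register. */
        return (uint32_t)p[0]       | (uint32_t)p[1] << 8 |
               (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
    }
    /* Big-endian: the highest address supplies the
       least-significant byte of the register. */
    return (uint32_t)p[3]       | (uint32_t)p[2] << 8 |
           (uint32_t)p[1] << 16 | (uint32_t)p[0] << 24;
}
```

Applied to the word a of structure s, the bytes 11 12 13 14 read as big-endian and the bytes 14 13 12 11 read as little-endian both produce the register value 0x11121314.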
The function of the Load Byte-Reverse and Store Byte-Reverse instructions is use-
ful when a particular page in storage contains some data written in big-endian
ordering and other data written in little-endian ordering. In such an environment,
the Endianness storage attribute for the page would be set according to the pre-
dominant byte ordering for the page, and ‘normal’, non-byte-reverse Load and
Store instructions would be used to access data operands which used this pre-
dominant byte ordering. Conversely, Load Byte-Reverse and Store Byte-Reverse
instructions would be used to access the data operands which used the other byte
ordering.
The synchronization described in this section refers to the state of the processor
that is performing the synchronization.
1. The operation is not initiated or, in the case of isync, does not complete, until
all instructions already in execution have completed to a point at which they
have reported all exceptions they will cause.
2. The instructions that precede the operation complete execution in the context
(privilege, address space, storage protection, etc.) in which they were initiated.
4. The instructions that follow the operation will be fetched and executed in the
context established by the operation as required by the sequential execution
model. (This requirement dictates that any prefetched instructions be dis-
carded and that any effects and side effects of executing them out-of-order
also be discarded, except as described in Section 6.1.5 on page 112.)
The Machine State Register (MSR) is a 32-bit register. Machine State Register bits
are numbered 32 (most-significant bit) to 63 (least-significant bit). This register
defines the state of the processor (i.e. enabling and disabling of interrupts and
debugging exceptions, selection of address space for instruction and data storage
accesses, and specifying whether the processor is in supervisor or user mode).
The Machine State Register contents are automatically saved, altered, and
restored by the interrupt-handling mechanism, as described in Section 7.5 on
page 151. If a non-critical interrupt is taken, the contents of the Machine State
Register are automatically copied into Save/Restore Register 1. If a critical inter-
rupt is taken, the contents of the Machine State Register are automatically copied
into Critical Save/Restore Register 1. When an rfi or rfci is executed, the con-
tents of the Machine State Register are restored from Save/Restore Register 1 or
Critical Save/Restore Register 1, respectively.
The contents of the Machine State Register can be read into bits 32:63 of a GPR
using mfmsr RT, setting bits 0:31 of GPR(RT) to an undefined value. The contents
of bits 32:63 of GPR(RS) can be written to the Machine State Register using mtmsr
RS. MSR[EE] may be set or cleared atomically using wrtee or wrteei.
Bit(s) Description
32:37 Reserved
38 Allocated for implementation-dependent use
39:44 Reserved
Programming Note
A Machine State Register bit that is reserved may be altered by rfi/rfci.
The contents of the Processor Identification Register can be read into bits 32:63 of
GPR(RT) using mfspr RT,PIR, setting bits 0:31 of GPR(RT) to an undefined value.
The means by which the Processor Identification Register is initialized are imple-
mentation-dependent (see User’s Manual).
The Processor Version Register (PVR) is a 32-bit read-only register. Processor Ver-
sion Register bits are numbered 32 (most-significant bit) to 63 (least-significant
bit). The Processor Version Register contains a value identifying the version and
revision level of the processor.
The contents of the Processor Version Register can be read into bits 32:63 of
GPR(RT) using mfspr RT,PVR, setting bits 0:31 of GPR(RT) to an undefined value.
Write access to the Processor Version Register is not provided.
Bit(s) Description
32:47 Version
A 16-bit number that identifies the version of the processor. Different version
numbers indicate major differences between processors, such as which optional
facilities and instructions are supported.
48:63 Revision
A 16-bit number that distinguishes between implementations of the version. Dif-
ferent revision numbers indicate minor differences between processors having the
same version number, such as clock rate and Engineering Change level.
Version numbers are assigned by the Book E process. Revision numbers are assigned
by an implementation-defined process.
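The two fields can be separated with a shift and a mask once the register contents have been read. The sketch below assumes the 32-bit register value (architectural bits 32:63) is already held in a C variable. The function names are hypothetical, and the example value used in the note after the code is made up, not a real processor's PVR.

```c
#include <stdint.h>

/* pvr holds the 32-bit Processor Version Register contents,
   i.e. architectural bits 32:63. */

/* Version: bits 32:47, the upper 16 bits of the 32-bit value. */
uint16_t pvr_version(uint32_t pvr)
{
    return (uint16_t)(pvr >> 16);
}

/* Revision: bits 48:63, the lower 16 bits of the 32-bit value. */
uint16_t pvr_revision(uint32_t pvr)
{
    return (uint16_t)(pvr & 0xFFFFu);
}
```

For a made-up value of 0x12345678, the version is 0x1234 and the revision is 0x5678.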
The following are examples of differences that generally should be considered ‘minor’.
In general, any change to a processor should cause a new PVR value to be assigned.
Even a seemingly trivial change that is not expected to be apparent to software should
cause a new revision number to be assigned, in case the change is later discovered to
have introduced an error that software must circumvent.
SPRG3
This 64-bit register can be written only in supervisor mode. SPRG3 can be
read in supervisor mode. It is implementation-dependent whether or not
SPRG3 can be read in user mode. See the User’s Manual for the
implementation.
USPRG0
This 64-bit register can be accessed in supervisor or user mode.
The contents of SPRGi can be read into GPR(RT) using mfspr RT,SPRGi. The con-
tents of GPR(RS) can be written into SPRGi using mtspr SPRGi,RS.
The contents of USPRG0 can be read into GPR(RT) using mfspr RT,USPRG0. The
contents of GPR(RS) can be written into USPRG0 using mtspr USPRG0,RS.
Device Control Registers (DCRs) are on-chip registers that exist architecturally
outside the processor core and thus are not actually part of Book E. Book E sim-
ply defines the existence of a Device Control Register ‘address space’ and the
instructions to access them and does not define any particular Device Control
Registers themselves.
The contents of Device Control Register DCRN can be read into GPR(RT) using
mfdcr RT,DCRN. The contents of GPR(RS) can be written into Device Control Reg-
ister DCRN using mtdcr DCRN,RS.
sc, rfi, and rfci are system linkage instructions which enable the program to call
upon the system to perform a service (i.e. invoke a System Call interrupt), and by
which the system can return from performing a service or from processing an
interrupt. System linkage instructions are context synchronizing, as defined in
Section 1.12.1 on page 38.
This section describes the registers and instructions that make up the branch and
Condition Register operations facilities of Book E.
The Condition Register (CR) is a 32-bit register. Condition Register bits are num-
bered 32 (most-significant bit) to 63 (least-significant bit). The Condition Register
reflects the result of certain operations, and provides a mechanism for testing
(and branching).
The bits in the Condition Register are grouped into eight 4-bit fields, named CR
Field 0 (CR0), CR Field 1 (CR1),..., and CR Field 7 (CR7), which are set in one of
the following ways.
• A specified Condition Register field can be set as the result of either an inte-
ger or a floating-point Compare instruction.
If any portion of the result is undefined, then the value placed into the first three
bits of CR Field 0 is undefined.
CR Bit Description
32 Negative (LT)
Bit 32 of the result is equal to 1.
33 Positive (GT)
Bit 32 of the result is equal to 0 and at least one of bits 33:63 of the
result is non-zero.
34 Zero (EQ)
Bits 32:63 of the result are equal to 0.
Programming Note
CR Field 0 may not reflect the ‘true’ (infinitely precise) result if overflow occurs: see Sec-
tion 4.3.3, “Integer Arithmetic Instructions”, on page 59.
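The setting of the first three bits of CR Field 0 from bits 32:63 of a result can be sketched as below. The 4-bit field is modeled as a small integer with CR bit 32 as its most-significant bit. The constant and function names are hypothetical.

```c
#include <stdint.h>

/* CR Field 0 modeled as a 4-bit value: bit 32 is 0x8, bit 33 is 0x4,
   bit 34 is 0x2. (Bit 35, 0x1, is not modeled here.) */
enum { CR_LT = 0x8, CR_GT = 0x4, CR_EQ = 0x2 };

unsigned cr0_first_three_bits(uint64_t result)
{
    uint32_t low = (uint32_t)result;   /* bits 32:63 of the result */

    if (low & 0x80000000u)             /* bit 32 is 1 */
        return CR_LT;
    if (low != 0)                      /* bit 32 is 0, some bit of 33:63 is 1 */
        return CR_GT;
    return CR_EQ;                      /* bits 32:63 are all 0 */
}
```

Note that only bits 32:63 are examined: a result of 0xFFFF_FFFF_0000_0000 is reported as Zero.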
The Link Register (LR) is a 64-bit register. Link Register bits are numbered 0
(most-significant bit) to 63 (least-significant bit). The Link Register can be used to
provide the branch target address for the Branch Conditional to Link Register
instruction, and it holds the return address after Branch and Link instructions.
The contents of the Link Register can be read into a GPR using mfspr RT,LR. The
contents of GPR(RS) can be written to the Link Register using mtspr LR,RS.
The Count Register (CTR) is a 64-bit register. Count Register bits are numbered 0
(most-significant bit) to 63 (least-significant bit). Bits 32:63 of the Count Register
can be used to hold a loop count that can be decremented during execution of
Branch instructions that contain an appropriately encoded BO field. If the value in
bits 32:63 of the Count Register is 0 before being decremented, it is –1 afterward
and bits 0:31 are left unchanged. The entire 64-bit Count Register can also be
used to provide the branch target address for the Branch Conditional to Count Reg-
ister instruction.
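The decrement of bits 32:63 with bits 0:31 left unchanged can be sketched as follows. The function name is hypothetical, and the 64-bit Count Register is modeled as a uint64_t.

```c
#include <stdint.h>

/* Decrement bits 32:63 of the Count Register, leaving bits 0:31 unchanged. */
uint64_t ctr_decrement(uint64_t ctr)
{
    /* 32-bit unsigned subtraction wraps 0 to 0xFFFFFFFF, i.e. -1. */
    uint32_t low = (uint32_t)ctr - 1u;

    return (ctr & 0xFFFFFFFF00000000ull) | low;
}
```

For example, a Count Register value of 0x1234_5678_0000_0000 becomes 0x1234_5678_FFFF_FFFF after the decrement.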
The contents of the Count Register can be read into a GPR using mfspr RT,CTR.
The contents of GPR(RS) can be written to the Count Register using mtspr CTR,RS.
The Branch instructions compute the effective address (EA) of the target in one of
the following four ways, as described in Section 1.11.2.2, “Instruction Storage
Addressing Modes”, on page 32.
3. Using the address contained in the Link Register (Branch Conditional to Link
Register).
For the first two methods, the target addresses can be computed sufficiently
ahead of the Branch instruction that instructions can be prefetched along the tar-
get path. For the third and fourth methods, prefetching instructions along the tar-
get path is also possible provided the Link Register or the Count Register is loaded
sufficiently ahead of the Branch instruction.
The encoding for the BO field is as follows. If the BO field specifies that the Count
Register is to be decremented, bits 32:63 of the Count Register are decremented. If
the BO field specifies a condition that must be TRUE or FALSE, that condition is
obtained from the contents of bit BI+32 of the Condition Register. (Bits in the
Condition Register are numbered 32:63, and ‘BI’ refers to the BI field in the Branch
instruction encoding. For example, specifying BI=2 refers to bit 34 of the Condition
Register.)
BO      Description
0000y   Decrement CTR[32:63], then branch if the decremented CTR[32:63]≠0 and
        the condition is FALSE.
0001y   Decrement CTR[32:63], then branch if the decremented CTR[32:63]=0 and
        the condition is FALSE.
001zy   Branch if the condition is FALSE.
0100y   Decrement CTR[32:63], then branch if the decremented CTR[32:63]≠0 and
        the condition is TRUE.
0101y   Decrement CTR[32:63], then branch if the decremented CTR[32:63]=0 and
        the condition is TRUE.
011zy   Branch if the condition is TRUE.
1z00y   Decrement CTR[32:63], then branch if the decremented CTR[32:63]≠0.
1z01y   Decrement CTR[32:63], then branch if the decremented CTR[32:63]=0.
1z1zz   Branch always.
The ‘y’ bit provides a hint about whether a conditional branch is likely to be taken,
and may be used by some implementations to improve performance.
The ‘branch always’ encoding of the BO field does not have a ‘y’ bit.
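The BO encodings can be read as four independent controls, which the following sketch decodes. It is an illustration, not an architectural definition. The function name is hypothetical, BO is passed as a 5-bit integer whose most-significant bit is the leftmost bit of the encoding, and the Count Register is passed by value, so the write-back of the decremented count is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decide whether a Branch Conditional is taken.
   bo   : 5-bit BO field, leftmost encoding bit = 0x10
   cond : value of the Condition Register bit selected by BI
   ctr  : Count Register contents before any decrement */
bool branch_taken(unsigned bo, bool cond, uint64_t ctr)
{
    assert(bo < 32);

    bool ctr_ok = true;
    if (!(bo & 0x04)) {
        /* Third bit is 0: decrement bits 32:63 and test the result. */
        uint32_t low = (uint32_t)ctr - 1u;
        bool want_zero = (bo & 0x02) != 0;   /* fourth bit: =0 versus ≠0 */
        ctr_ok = ((low == 0) == want_zero);
    }

    /* First bit is 1: condition ignored. Otherwise the condition must
       equal the second bit (1 = TRUE, 0 = FALSE). */
    bool cond_ok = (bo & 0x10) != 0 || (cond == ((bo & 0x08) != 0));

    return ctr_ok && cond_ok;
}
```

The last bit (the 'y' hint, or a 'z' bit) does not affect the outcome in this model.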
For Branch Conditional instructions that have a ‘y’ bit, using y=0 indicates that
the following behavior is likely.
• If the instruction is bc, bcl, bca, bcla, bce, bcel, bcea, or bcela with a nega-
tive value in the displacement field, the branch is taken.
• In all other cases (bc, bcl, bca, bcla, bce, bcel, bcea, or bcela with a nonneg-
ative value in the displacement field, bclr, bclrl, bclre, bclrel, bcctr, bcctrl,
bcctre, or bcctrel), the branch falls through (is not taken).
The displacement field is used as described above even if the target is an absolute
address.
Programming Note
The ‘z’ bits should be set to 0, as they may be assigned a meaning in some future ver-
sion of the architecture.
The default value for the ‘y’ bit should be 0: the value 1 should be used only if software
has determined that the prediction corresponding to y=1 is more likely to be correct than
the prediction corresponding to y=0.
Here ‘s’ is bit 16 of the instruction, which is the sign bit of the displacement field if the
instruction has a displacement field and is 0 otherwise. BO4 is the ‘y’ bit, or a bit that is
ignored for the ‘branch always’ encoding of the BO field. (Advantage is taken of the fact
that, for bclr, bclrl, bclre, bclrel, bcctr, bcctrl, bcctre, or bcctrel, bit 16 of the instruc-
tion is part of a reserved field and therefore must be 0.)
Programming Note
In some implementations the processor may keep a stack of the Link Register values
most recently set by Branch and Link instructions, with the possible exception of the
form shown below for obtaining the address of the next instruction. To benefit from this
stack, the following programming conventions should be used.
bcl 20,31,$+4
• Loop counts:
Keep them in the Count Register, and use one of the Branch Conditional instructions
to decrement the count and to control branching (e.g., branching back to the start of
a loop if the decremented counter value is nonzero).
Use the Count Register to hold the address to branch to, and use the bcctr
instruction (LK=0) to branch to the selected address.
– A calls B: use a Branch instruction that sets the Link Register (LK=1).
– B returns to A: use the bclr instruction (LK=0) (the return address is in, or can
be restored to, the Link Register).
Here A calls Glue, Glue calls B, and B returns to A rather than to Glue. (Such a
calling sequence is common in linkage code used when the subroutine that the
programmer wants to call, here B, is in a different module from the caller: the Binder
inserts ‘glue’ code to mediate the branch.) The three branches should be as follows.
– A calls Glue: use a Branch instruction that sets the Link Register (LK=1).
– Glue calls B: place the address of B into the Count Register, and use the bcctr
instruction (LK=0).
– B returns to A: use the bclr instruction (LK=0) (the return address is in, or can
be restored to, the Link Register).
This chapter describes the registers and instructions that make up the integer
operations. Section 4.2 describes the registers associated with the integer opera-
tions. Section 4.3 describes the instructions associated with integer operations.
The Integer Exception Register (XER) is a 64-bit register. Table 4-1 provides bit
definitions for the Integer Exception Register.
Integer Exception Register bits are set based on the operation of an instruction
considered as a whole, not on intermediate results (e.g., the Subtract From Carry-
ing instruction, the result of which is specified as the sum of three values, sets
bits in the Integer Exception Register based on the entire operation, not on an
intermediate sum).
Bit(s) Description
0 Summary Overflow 64 (SO64)
The Summary Overflow 64 bit is set to 1 whenever an instruction (except mtspr) sets
the Overflow 64 bit to 1. Once set to 1, the SO64 bit remains set until it is cleared
by an mtspr instruction (specifying the Integer Exception Register) or an mcrxr64
instruction. The SO64 bit is not altered by Compare instructions, nor by other
instructions (except mtspr to the Integer Exception Register, and mcrxr64) that cannot
overflow. Executing an mtspr instruction to the Integer Exception Register, supplying
the values 0 for SO64 and 1 for OV64, causes SO64 to be set to 0 and OV64 to
be set to 1.
1 Overflow 64 (OV64)
The Overflow 64 bit is set to indicate that an overflow has occurred during execution
of an instruction. X-form Add, Subtract From, and Negate instructions having OE=1
set OV64 to 1 if the carry out of bit 0 is not equal to the carry out of bit 1, and set
OV64 to 0 otherwise. This condition reflects a signed overflow. XO-form Multiply Low
Doubleword and Divide Doubleword instructions having OE=1 set OV64 to 1 if the
result cannot be represented in 64 bits (mulld, divd, divdu), and set OV64 to 0 oth-
erwise. The OV64 bit is not altered by Compare instructions, nor by other instruc-
tions (except mtspr to the Integer Exception Register, and mcrxr64) that cannot
overflow.
2 Carry 64 (CA64)
The Carry 64 bit is set as follows during execution of certain instructions. Add Car-
rying, Subtract From Carrying, Add Extended, and Subtract From Extended instruc-
tions set CA64 to 1 if there is a carry out of bit 0, and set CA64 to 0 otherwise. CA64
can be used to indicate unsigned overflow for add and subtract operations that set
CA64. Shift Right Algebraic Doubleword instructions set CA64 to 1 if any 1-bits have
been shifted out of a negative operand, and set CA64 to 0 otherwise. The CA64 bit
is not altered by Compare instructions, nor by other instructions (except Shift Right
Algebraic, mtspr to the Integer Exception Register, and mcrxr64) that cannot carry.
3:31 Reserved
32 Summary Overflow (SO)
The Summary Overflow bit is set to 1 whenever an instruction (except mtspr) sets
the Overflow bit. Once set, the SO bit remains set until it is cleared by an mtspr in-
struction (specifying the Integer Exception Register) or an mcrxr instruction. The SO
bit is not altered by Compare instructions, nor by other instructions (except mtspr
to the Integer Exception Register, and mcrxr) that cannot overflow. Executing an
mtspr instruction to the Integer Exception Register, supplying the values 0 for SO
and 1 for OV, causes SO to be set to 0 and OV to be set to 1.
33 Overflow (OV)
The Overflow bit is set to indicate that an overflow has occurred during execution of
an instruction. X-form Add, Subtract From, and Negate instructions having OE=1 set
OV to 1 if the carry out of bit 32 is not equal to the carry out of bit 33, and set OV
to 0 otherwise. This condition reflects a signed overflow. X-form Multiply Low Word
and Divide Word instructions having OE=1 set OV to 1 if the result cannot be repre-
sented in 32 bits (mullw, divw, divwu), and set OV to 0 otherwise. The OV bit is not
altered by Compare instructions, nor by other instructions (except mtspr to the In-
teger Exception Register, and mcrxr) that cannot overflow.
34 Carry (CA)
The Carry bit is set as follows, during execution of certain instructions. Add Carry-
ing, Subtract From Carrying, Add Extended, and Subtract From Extended instructions
set it to 1 if there is a carry out of bit 32, and set it to 0 otherwise. CA can be used
to indicate unsigned overflow for add and subtract operations that set CA. Shift Right
Algebraic Word instructions set CA to 1 if any 1-bits have been shifted out of a neg-
ative operand, and set CA to 0 otherwise. The CA bit is not altered by Compare in-
structions, nor by other instructions (except Shift Right Algebraic Word, mtspr to the
Integer Exception Register, and mcrxr) that cannot carry.
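The carry-based definitions of OV and CA for a 32-bit addition can be sketched as follows. This is a model of the arithmetic only: the struct and function names are hypothetical, and the sketch computes both values together even though a given instruction sets OV only when OE=1 and sets CA only if it is one of the carrying or extended forms.

```c
#include <stdbool.h>
#include <stdint.h>

struct add_flags {
    uint32_t sum;   /* bits 32:63 of the result */
    bool     ca;    /* carry out of bit 32 */
    bool     ov;    /* carry out of bit 32 differs from carry out of bit 33 */
};

struct add_flags add32_flags(uint32_t a, uint32_t b)
{
    struct add_flags r;
    uint64_t wide = (uint64_t)a + b;

    /* Carry out of bit 32 (the most-significant bit of the low word). */
    bool carry_out_32 = ((wide >> 32) & 1) != 0;

    /* Carry out of bit 33: add everything below bit 32 and look at
       what is carried into bit 32. */
    bool carry_out_33 = ((((a & 0x7FFFFFFFu) + (b & 0x7FFFFFFFu)) >> 31) & 1) != 0;

    r.sum = (uint32_t)wide;
    r.ca  = carry_out_32;                   /* unsigned overflow */
    r.ov  = carry_out_32 != carry_out_33;   /* signed overflow */
    return r;
}
```

Adding 0x7FFF_FFFF and 1 gives OV=1 and CA=0 (signed overflow only), while adding 0xFFFF_FFFF and 1 gives OV=0 and CA=1 (unsigned overflow only).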
The Integer Load instructions compute the effective address (EA) of the storage to
be accessed as described in Section 1.11.2, “Effective Address Calculation”, on
page 31.
Many of the Integer Load instructions have an ‘update’ form, in which GPR(RA) is
updated with the effective address. For these forms, if RA≠0 and RA≠RT, the effec-
tive address is placed into GPR(RA) and the storage element (byte, halfword, word,
or doubleword) addressed by EA is loaded into GPR(RT). If RA=0 or RA=RT, the
instruction form is invalid.
Integer Load storage accesses will cause a Data Storage interrupt if the program is
not allowed to read the storage location. Integer Load storage accesses will cause a
Data TLB Error interrupt if the program attempts to access storage that is
unavailable (i.e. not currently mapped by the TLB).
Programming Note
In some implementations, the Load Halfword Algebraic and ‘with update’ Integer Load
instructions may have greater latency than other types of Load instructions. Moreover,
‘with update’ Integer Load instructions may take longer to execute in some implementa-
tions than the corresponding pair of a non-update Load instruction and an Add
instruction.
Programming Note
The DES field in DES-form Integer Load instructions is a word offset, not a byte offset like
the DE field in DE-form Integer Load instructions and D field in D-form Integer Load
instructions. However, for programming convenience, assemblers should support the
specification of byte offsets for both forms of instruction.
Engineering Note
Implementations are strongly recommended to ignore bit 31 of instruction encodings for
X-form Integer Load instructions.
The Integer Store instructions compute the effective address (EA) of the storage to
be accessed as described in Section 1.11.2, “Effective Address Calculation”, on
page 31.
The contents of GPR(RS) are stored into the byte, halfword, word, or doubleword
in storage addressed by EA.
Many of the Integer Store instructions have an ‘update’ form, in which GPR(RA) is
updated with the effective address. For these forms, the following rules apply.
• If RS=RA, the contents of GPR(RS) are copied to the target storage element
and then EA is placed into GPR(RA).
Integer Store storage accesses will cause a Data Storage interrupt if the program is
not allowed to write to the storage location. Integer Store storage accesses will
cause a Data TLB Error interrupt if the program attempts to access storage that is
unavailable.
Programming Note
The DES field in DES-form Integer Store instructions is a word offset, not a byte offset like
the DE field in DE-form Integer Store instructions and D field in D-form Integer Store
instructions. However, for programming convenience, assemblers should support the
specification of byte offsets for both forms of instruction.
The integer arithmetic instructions use the contents of the GPRs as source oper-
ands, and place results into GPRs, into status bits in the Integer Exception Regis-
ter, and into CR Field 0. addi and addis use the value 0, not the contents of
GPR(0), if RA=0.
The integer arithmetic instructions treat source operands as signed, two’s comple-
ment integers unless the instruction is explicitly identified as performing an
unsigned operation.
The X-form instructions with Rc=1, and the D-form instruction addic. set the
first three bits of CR Field 0 to characterize bits 32:63 of the result that is placed
in the target register. These bits are set by signed comparison of bits 32:63 of the
result to zero.
Programming Note
Instructions with the OE bit set or that set CA may execute slowly or may prevent the
execution of subsequent instructions until the instruction has completed.
The X-form Arithmetic instructions set SO and OV when OE=1 to reflect overflow
of bits 32:63 of the result. X-form Arithmetic instructions also set SO64 and OV64
when OE=1 to reflect overflow of bits 0:63 of the result.
Programming Note
Notice that CR Field 0 may not reflect the ‘true’ (infinitely precise) result if overflow
occurs.
Programming Note
addi, addis, add, and subf are the preferred instructions for addition and subtraction,
because they set few status bits.
The X-form Logical instructions with Rc=1, and the D-form Logical instructions
andi. and andis. set the first three bits of CR Field 0 as described in Section
4.3.3, “Integer Arithmetic Instructions”, on page 59. The Logical instructions do
not change the SO, OV, CA, SO64, OV64 and CA64 bits in the Integer Exception
Register.
The integer Compare instructions compare the contents of GPR(RA) with (1) the
sign-extended value of the SI field, (2) the zero-extended value of the UI field, or (3)
the contents of GPR(RB). The comparison is signed for cmpi and cmp, and
unsigned for cmpli and cmpl.
For 64-bit implementations, the L field controls whether the operands are treated
as 64-bit or 32-bit quantities, as follows:
L Operand length
0 32-bit operands
1 64-bit operands
When the operands are treated as 32-bit signed quantities, bit 32 of the register
(RA or RB) is the sign bit.
The Compare instructions set one bit in the left-most three bits of the designated
CR field to 1, and the other two to 0. The SO bit of the Integer Exception Register
is copied to bit 3 of the designated CR field.
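The effect of a Compare instruction on the designated CR field can be sketched as follows. The function name and parameters are hypothetical. The second operand is assumed to be passed already extended (sign-extended SI, zero-extended UI, or the contents of GPR(RB)), and the 4-bit CR field is modeled with its leftmost bit as 0x8.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns the 4-bit CR field: 0x8 = less than, 0x4 = greater than,
   0x2 = equal, 0x1 = copy of the SO bit.
   l         : L field (false = 32-bit operands, true = 64-bit operands)
   is_signed : true for cmpi/cmp, false for cmpli/cmpl
   so        : current SO bit of the Integer Exception Register */
unsigned compare_field(uint64_t ra, uint64_t rb, bool l, bool is_signed, bool so)
{
    uint64_t a = ra, b = rb;
    uint64_t sign = 1ull << 63;

    if (!l) {
        /* 32-bit operands: only bits 32:63 participate, and bit 32
           is the sign bit for signed comparison. */
        a = (uint32_t)ra;
        b = (uint32_t)rb;
        sign = 1ull << 31;
    }
    if (is_signed) {
        /* Flipping the sign bit maps signed order onto unsigned order. */
        a ^= sign;
        b ^= sign;
    }

    unsigned field = (a < b) ? 0x8u : (a > b) ? 0x4u : 0x2u;
    return field | (so ? 0x1u : 0x0u);
}
```

Exactly one of the three leftmost bits is set in every returned value, matching the rule stated above.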
The Trap instructions are provided to test for a specified set of conditions from
comparing the contents of one GPR with a second GPR or immediate data. If any
of the conditions tested by a Trap instruction are met, a Trap exception type
Program interrupt is invoked.
The contents of GPR(RA) are compared with either the sign-extended value of the
SI field or the contents of GPR(RB), depending on the Trap instruction. For tdi
and td, the entire contents of RA (and RB) participate in the comparison; for twi
and tw, only the contents of bits 32:63 of RA (and RB) participate in the compari-
son.
This comparison results in five conditions which are ANDed with TO. If the result
is not 0 the Trap exception type Program interrupt is invoked. These conditions
are as follows.
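The list of conditions is not reproduced in this text. The sketch below assumes the conventional PowerPC assignment of the five TO bits, from leftmost to rightmost: less than (signed), greater than (signed), equal, logically less than (unsigned), and logically greater than (unsigned). That assignment, the function name, and the parameters are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* to         : 5-bit TO field, leftmost bit = 0x10 (assumed bit assignment)
   a, b       : contents of GPR(RA) and the second operand (already extended)
   doubleword : true for tdi/td (all 64 bits), false for twi/tw (bits 32:63) */
bool trap_taken(unsigned to, uint64_t a, uint64_t b, bool doubleword)
{
    uint64_t sign = 1ull << 63;

    if (!doubleword) {
        a = (uint32_t)a;
        b = (uint32_t)b;
        sign = 1ull << 31;
    }

    /* Sign-bit-flipped copies give signed ordering under unsigned compare. */
    uint64_t sa = a ^ sign, sb = b ^ sign;

    unsigned conditions = 0;
    if (sa < sb) conditions |= 0x10;   /* less than, signed */
    if (sa > sb) conditions |= 0x08;   /* greater than, signed */
    if (a == b)  conditions |= 0x04;   /* equal */
    if (a < b)   conditions |= 0x02;   /* logically less than */
    if (a > b)   conditions |= 0x01;   /* logically greater than */

    /* The conditions are ANDed with TO; any surviving bit traps. */
    return (conditions & to) != 0;
}
```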
Instructions are provided that perform rotation operations on data from a GPR
and return the result, or a portion of the result, to a GPR.
The rotation operations rotate a 64-bit quantity left by a specified number of bit
positions. Bits that exit from position 0 enter at position 63.
For the first type, denoted rotate64 or ROTL64, the value rotated is the given 64-bit
value. The rotate64 operation is used to rotate a given 64-bit quantity.
For the second type, denoted rotate32 or ROTL32, the value rotated consists of two
copies of the given 32-bit value, one copy in bits 0:31 and the other in bits 32:63.
The rotate32 operation is used to rotate a given 32-bit quantity employing the 64-
bit rotator.
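The two rotation types can be sketched in Python as shown below. The function names are illustrative only; Python integers are unbounded, so each result is masked to 64 bits explicitly.

```python
MASK64 = (1 << 64) - 1

def rotl64(x, n):
    """Rotate the 64-bit value x left by n positions; bits leaving bit 0 re-enter at bit 63."""
    x &= MASK64
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64

def rotl32(x, n):
    """Rotate a 32-bit value with the 64-bit rotator by duplicating it in both halves."""
    w = x & 0xFFFFFFFF
    return rotl64((w << 32) | w, n)
```

Because the 32-bit value is duplicated, bits 32:63 of the `rotl32` result hold the 32-bit rotation of the input, and bits 0:31 hold a second copy of it.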
The Rotate and Shift instructions employ a mask generator. The mask is 64 bits
long, and consists of 1-bits from a start bit, mstart, through and including a stop
bit, mstop, and 0-bits elsewhere. The values of mstart and mstop range from 0 to
63. If mstart > mstop, the 1-bits wrap around from position 63 to position 0. Thus
the mask is formed as follows:
For instructions that use the rotate32 operation, the mask start and stop positions
are always in bits 32:63 of the mask.
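The mask generator described above can be sketched in Python as follows. This is an illustration under the stated bit numbering (bit 0 is the most significant bit of the 64-bit mask); the helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1

def ones_from(bit):
    """1-bits in positions bit..63 (bit 0 is the most significant); bit may be 64."""
    return (1 << (64 - bit)) - 1

def mask(mstart, mstop):
    """64-bit mask with 1-bits from mstart through mstop, wrapping if mstart > mstop."""
    if mstart <= mstop:
        return ones_from(mstart) & ~ones_from(mstop + 1) & MASK64
    # Wrapped case: 1-bits in mstart..63 and in 0..mstop
    return ones_from(mstart) | (MASK64 & ~ones_from(mstop + 1))
```

For a rotate32 instruction both positions lie in 32:63, so for instance `mask(32, 63)` selects exactly the low-order word.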
The Rotate Word and Shift Word instructions with Rc=1 set the first three bits of
CR field 0 as described in Section 4.3.3, “Integer Arithmetic Instructions”, on
page 59. Rotate and Shift instructions do not change the OV, OV64, SO, and SO64
bits. Rotate and Shift instructions, except algebraic right shifts, do not change the
CA or CA64 bits.
• inserted into the target register under control of a mask (if a mask bit is 1 the associated bit of the
rotated data is placed into the target register, and if the mask bit is 0 the associated bit in the tar-
get register remains unchanged); or
• ANDed with a mask before being placed into the target register.
The Rotate Left instructions allow right-rotation of the contents of a register to be performed (in concept)
by a left-rotation of 64-n, where n is the number of bits by which to rotate right. They allow right-rotation
of the contents of bits 32:63 of a register to be performed (in concept) by a left-rotation of 32-n, where n
is the number of bits by which to rotate right.
Architecture Note
For MD-form and MDS-form instructions, the MB and ME fields are used in permuted rather than
sequential order because this is easier for the processor. Permuting the MB field permits the processor
to obtain the low-order five bits of the MB value from the same place for all instructions having an MB
field (M-form and MD-form instructions). Permuting the ME field permits the processor to treat bits
21:26 of all MD-form instructions uniformly.
5.1 Overview
This chapter describes the registers and instructions that make up the floating-
point operations. Section 5.2 on page 69 describes the registers associated with
floating-point operations. Section 5.6 on page 98 describes the instructions asso-
ciated with floating-point operations.
• computational instructions
• non-computational instructions
There is one class of exceptional events that occur during instruction execution
that is unique to floating-point operations: the Floating-Point Exception. Floating-
point exceptions are signaled with bits set in the Floating-Point Status and Con-
trol Register (FPSCR). They can cause an Enabled exception type Program inter-
rupt to be taken, precisely or imprecisely, if the proper control bits are set.
Floating-Point Exceptions
SNaN (VXSNAN)
Infinity-Infinity (VXISI)
Infinity÷Infinity (VXIDI)
Zero÷Zero (VXZDZ)
Infinity×Zero (VXIMZ)
Invalid Compare (VXVC)
Software Request (VXSOFT)
Invalid Square Root (VXSQRT)
Invalid Integer Convert (VXCVI)
Each Floating-Point Register contains 64 bits that support the floating-point dou-
ble format. Every instruction that interprets the contents of a Floating-Point Reg-
ister as a floating-point value uses the floating-point double format for this
interpretation.
The computational instructions, and the Move and Select instructions, operate on
data located in Floating-Point Registers and, with the exception of the Compare
instructions, place the result value into a Floating-Point Register and optionally
place status information into the Condition Register.
Load and store double instructions are provided that transfer 64 bits of data
between storage and the Floating-Point Registers with no conversion. Load single
instructions are provided to transfer and convert floating-point values in floating-
point single format from storage to the same value in floating-point double format
in the Floating-Point Registers. Store single instructions are provided to transfer
and convert floating-point values in floating-point double format from the Float-
ing-Point Registers to the same value in floating-point single format in storage.
Instructions are provided that manipulate the Floating-Point Status and Control
Register and the Condition Register explicitly. Some of these instructions copy
data from a Floating-Point Register to the Floating-Point Status and Control Reg-
ister or vice versa.
The computational instructions and the Select instruction accept values from the
Floating-Point Registers in double format. For single-precision arithmetic instruc-
tions, all input values must be representable in single format; if they are not, the
result placed into the target Floating-Point Register, and the setting of status bits
in the Floating-Point Status and Control Register and in the Condition Register (if
Rc=1), are undefined.
The Floating-Point Status and Control Register (FPSCR) controls the handling of
floating-point exceptions and records status resulting from the floating-point
operations. Bits 32:55 are status bits. Bits 56:63 are control bits.
The exception bits in the Floating-Point Status and Control Register (bits 35:44,
53:55) are sticky; that is, once set to 1 they remain set to 1 until they are set to 0
by an mcrfs, mtfsfi, mtfsf, or mtfsb0 instruction. The exception summary bits
in the Floating-Point Status and Control Register (FX, FEX, and VX, which are
bits 32:34) are not considered to be ‘exception bits’, and only FX is sticky.
Bit(s) Description
32 Floating-Point Exception Summary (FX)
Every floating-point instruction, except mtfsfi and mtfsf, implicitly sets FPSCRFX to
1 if that instruction causes any of the floating-point exception bits in the Floating-
Point Status and Control Register to change from 0 to 1. mcrfs, mtfsfi, mtfsf,
mtfsb0, and mtfsb1 can alter FPSCRFX explicitly.
33 Floating-Point Enabled Exception Summary (FEX)
This bit is the OR of all the floating-point exception bits masked by their respective
enable bits. mcrfs, mtfsfi, mtfsf, mtfsb0, and mtfsb1 cannot alter FPSCRFEX ex-
plicitly.
34 Floating-Point Invalid Operation Exception Summary (VX)
This bit is the OR of all the Invalid Operation exception bits. mcrfs, mtfsfi, mtfsf,
mtfsb0, and mtfsb1 cannot alter FPSCRVX explicitly.
35 Floating-Point Overflow Exception (OX)
See Section 5.4.3 on page 89.
36 Floating-Point Underflow Exception (UX)
See Section 5.4.4 on page 91.
37 Floating-Point Zero Divide Exception (ZX)
See Section 5.4.2 on page 88.
38 Floating-Point Inexact Exception (XX)
See Section 5.4.5 on page 93.
FPSCRXX is a sticky version of FPSCRFI (see below). Thus the following rules com-
pletely describe how FPSCRXX is set by a given instruction.
• If the instruction affects FPSCRFI, the new value of FPSCRXX is obtained by ORing
the old value of FPSCRXX with the new value of FPSCRFI.
• If the instruction does not affect FPSCRFI, the value of FPSCRXX is unchanged.
39 Floating-Point Invalid Operation Exception (SNaN) (VXSNAN)
See Section 5.4.1 on page 85.
40 Floating-Point Invalid Operation Exception (∞-∞) (VXISI)
See Section 5.4.1 on page 85.
41 Floating-Point Invalid Operation Exception (∞÷∞) (VXIDI)
See Section 5.4.1 on page 85.
42 Floating-Point Invalid Operation Exception (0÷0) (VXZDZ)
See Section 5.4.1 on page 85.
43 Floating-Point Invalid Operation Exception (∞×0) (VXIMZ)
See Section 5.4.1 on page 85.
44 Floating-Point Invalid Operation Exception (Invalid Compare) (VXVC)
See Section 5.4.1 on page 85.
45 Floating-Point Fraction Rounded (FR)
The last Arithmetic or Rounding and Conversion instruction incremented the fraction
during rounding. See Section 5.3.6 on page 79. This bit is not sticky.
46 Floating-Point Fraction Inexact (FI)
The last Arithmetic or Rounding and Conversion instruction either produced an inex-
act result during rounding or caused a disabled Overflow Exception. See
Section 5.3.6 on page 79. This bit is not sticky.
See the definition of FPSCRXX, above, regarding the relationship between FPSCRFI
and FPSCRXX.
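The relationships among some of these bits can be sketched in Python. This is a minimal illustration under two assumptions that should be checked against the full register definition: the register is held as a 64-bit integer in which bit 0 is the most significant bit, and bits 53:55 are the remaining Invalid Operation exception bits (VXSOFT, VXSQRT, VXCVI). The helper names are hypothetical.

```python
def bit(reg, n):
    """Return bit n of a 64-bit register, where bit 0 is the most significant bit."""
    return (reg >> (63 - n)) & 1

def set_bit(reg, n, value):
    """Return reg with bit n replaced by value (0 or 1)."""
    pos = 63 - n
    return (reg & ~(1 << pos)) | ((value & 1) << pos)

# Bits 39:44 are listed above; 53:55 are assumed to be VXSOFT, VXSQRT, VXCVI.
INVALID_OP_BITS = list(range(39, 45)) + [53, 54, 55]

def update_vx(fpscr):
    """VX (bit 34) is the OR of all the Invalid Operation exception bits."""
    vx = int(any(bit(fpscr, n) for n in INVALID_OP_BITS))
    return set_bit(fpscr, 34, vx)

def update_xx(fpscr, new_fi):
    """For an instruction that affects FI (bit 46): XX (bit 38) accumulates FI."""
    fpscr = set_bit(fpscr, 46, new_fi)
    return set_bit(fpscr, 38, bit(fpscr, 38) | new_fi)
```

Calling `update_xx` twice, first with FI=1 and then with FI=0, leaves XX set while FI returns to 0, which is the sense in which XX is a sticky version of FI.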
Architecture Note
Setting Floating-Point Non-IEEE Mode (NI) to 1 is intended to permit results to be
approximate, and to cause performance to be more predictable and less data-dependent
than when NI=0. For example, in Non-IEEE Mode an implementation may return 0 instead
of a denormalized number, and may return a large number instead of an infinity. In
Non-IEEE Mode an implementation should provide a means for ensuring that all results
are produced without software assistance (i.e., without causing an Enabled exception
type Program interrupt or a Floating-Point Unimplemented Instruction exception type
Program interrupt, and without invoking an ‘emulation assist’: see Chapter 7 on
page 143). The means may be controlled by one or more other Floating-Point Status and
Control Register bits (recall that the other Floating-Point Status and Control Register
bits have implementation-dependent meanings when NI=1).
The lengths of the exponent and the fraction fields differ between these two for-
mats. The structure of the single and double formats is shown below.
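Floating-point single format (32 bits, numbered 0:31):
  Bit 0        S          sign bit
  Bits 1:8     EXP        exponent+bias
  Bits 9:31    FRACTION   fraction
Floating-point double format (64 bits, numbered 0:63):
  Bit 0        S          sign bit
  Bits 1:11    EXP        exponent+bias
  Bits 12:63   FRACTION   fraction
                      Single   Double
Exponent Bias          +127    +1023
Maximum Exponent       +127    +1023
Minimum Exponent       –126    –1022
Widths (bits)
  Format                 32       64
  Sign                    1        1
  Exponent                8       11
  Fraction               23       52
  Significand            24       53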
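The field widths and biases listed above can be checked with a short Python sketch that splits a value into its sign, biased exponent, and fraction fields. The `FORMATS` dictionary and the `decode` function are illustrative names, not part of the architecture; the standard `struct` module is used only to obtain the bit pattern.

```python
import struct

FORMATS = {
    "single": {"width": 32, "exp_bits": 8, "frac_bits": 23, "bias": 127},
    "double": {"width": 64, "exp_bits": 11, "frac_bits": 52, "bias": 1023},
}

def decode(value, fmt):
    """Return (sign, biased exponent, fraction) of value in the given format."""
    f = FORMATS[fmt]
    packed = struct.pack(">f" if fmt == "single" else ">d", value)
    bits = int.from_bytes(packed, "big")
    sign = bits >> (f["width"] - 1)
    exp = (bits >> f["frac_bits"]) & ((1 << f["exp_bits"]) - 1)
    frac = bits & ((1 << f["frac_bits"]) - 1)
    return sign, exp, frac
```

The value 1.0 has an unbiased exponent of 0, so its EXP field equals the bias in either format.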
The architecture requires that the Floating-Point Registers support the floating-
point double format only.
The NaNs are not related to the numeric values or infinities by order or value but
are encodings used to convey diagnostic information such as the representation of
uninitialized variables.
where s is the sign, E is the unbiased exponent, and 1.fraction is the significand,
which is composed of a leading unit bit (implied bit) and a fraction part.
Single Format:
1.2 × 10^-38 ≤ M ≤ 3.4 × 10^38
Double Format:
2.2 × 10^-308 ≤ M ≤ 1.8 × 10^308
where Emin is the minimum representable exponent value (-126 for single-preci-
sion, -1022 for double-precision).
Infinities (±∞)
These are values that have the maximum biased exponent value (255 in single
format, 2047 in double format) and a zero fraction value. They are used to
approximate values greater in magnitude than the maximum normalized value.
Infinity arithmetic is defined as the limiting case of real arithmetic, with restricted
operations defined among numbers and infinities. Infinities and the real numbers
can be related by ordering in the affine sense:
–∞ < every finite number < +∞
Arithmetic on infinities is always exact and does not signal any exception, except
when an exception occurs due to the invalid operations as described in
Section 5.4.1 on page 85.
Quiet NaNs are used to represent the results of certain invalid operations, such as
invalid arithmetic operations on infinities or on NaNs, when Invalid Operation
Exception is disabled (FPSCRVE=0). Quiet NaNs propagate through all floating-
point operations except comparison, Floating Round to Single-Precision, and con-
version to integer. Quiet NaNs do not signal exceptions, except for ordered com-
parison and conversion to integer operations. Specific encodings in QNaNs can
thus be preserved through a sequence of floating-point operations, and used to
convey diagnostic information to help identify results from invalid operations.
When a QNaN is the result of a floating-point operation because one of the oper-
ands is a NaN or because a QNaN was generated due to a disabled Invalid Opera-
tion Exception, then the following rule is applied to determine the NaN with the
high-order fraction bit set to 1 that is to be stored as the result.
if FPR(FRA) is a NaN
  then FPR(FRT) ← FPR(FRA)
  else if FPR(FRB) is a NaN
    then if instruction is frsp
      then FPR(FRT) ← FPR(FRB)_{0:34} || ^{29}0
      else FPR(FRT) ← FPR(FRB)
    else if FPR(FRC) is a NaN
      then FPR(FRT) ← FPR(FRC)
      else if generated QNaN
        then FPR(FRT) ← generated QNaN
If the operand specified by FRA is a NaN, then that NaN is stored as the result.
Otherwise, if the operand specified by FRB is a NaN (if the instruction specifies an
FRB operand), then that NaN is stored as the result, with the low-order 29 bits of
the result set to 0 if the instruction is frsp. Otherwise, if the operand specified by
FRC is a NaN (if the instruction specifies an FRC operand), then that NaN is
stored as the result. Otherwise, if a QNaN was generated due to a disabled Invalid
Operation Exception, then that QNaN is stored as the result. If a QNaN is to be
generated as a result, then the QNaN generated has a sign bit of 0, an exponent
field of all 1s, and a high-order fraction bit of 1 with all other fraction bits 0. Any
instruction that generates a QNaN as the result of a disabled Invalid Operation
must generate this QNaN (i.e., 0x7FF8_0000_0000_0000).
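The selection rule can be sketched in Python on raw 64-bit double-format patterns. This is an illustration only: the function and constant names are hypothetical, an operand that the instruction does not specify is passed as `None`, and the function is assumed to be called only when a QNaN result is required. Setting the high-order fraction bit of the selected NaN reflects the reading that the stored result always has that bit set to 1.

```python
EXP_MASK = 0x7FF0_0000_0000_0000
FRAC_MASK = 0x000F_FFFF_FFFF_FFFF
QUIET_BIT = 0x0008_0000_0000_0000        # high-order fraction bit
GENERATED_QNAN = 0x7FF8_0000_0000_0000

def is_nan(bits):
    """A NaN has an exponent field of all 1s and a nonzero fraction."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0

def select_qnan(fra=None, frb=None, frc=None, is_frsp=False):
    """Choose the QNaN result in the priority order FRA, FRB, FRC, generated."""
    if fra is not None and is_nan(fra):
        result = fra
    elif frb is not None and is_nan(frb):
        # frsp keeps bits 0:34 and sets the low-order 29 bits to 0
        result = frb & ~((1 << 29) - 1) if is_frsp else frb
    elif frc is not None and is_nan(frc):
        result = frc
    else:
        return GENERATED_QNAN
    return result | QUIET_BIT
```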
The following rules govern the sign of the result of an arithmetic, rounding, or
conversion operation, when the operation does not yield an exception. They apply
even when the operands or results are zeros or infinities.
• The sign of the result of an add operation is the sign of the operand having
the larger absolute value. If both operands have the same sign, the sign of the
result of an add operation is the same as the sign of the operands. The sign of
the result of the subtract operation x-y is the same as the sign of the result of
the add operation x+(-y).
When the sum of two operands with opposite sign, or the difference of two
operands with the same sign, is exactly zero, the sign of the result is positive
in all rounding modes except Round toward -Infinity, in which mode the sign is
negative.
• The sign of the result of a Square Root or Reciprocal Square Root Estimate
operation is always positive, except that the square root of -0 is -0 and the
reciprocal square root of -0 is -Infinity.
For the Multiply-Add instructions, the rules given above are applied first to the
multiply operation and then to the add or subtract operation (one of the inputs to
the add or subtract operation is the result of the multiply operation).
Engineering Note
When denormalized numbers are operands of multiply, divide, and square root opera-
tions, some implementations may prenormalize the operands internally before
performing the operations.
All computational, Move, and Select instructions use the floating-point double for-
mat.
All input values must be representable in single format; if they are not, the
result placed into the target Floating-Point Register, and the setting of status
bits in the Floating-Point Status and Control Register and in the Condition
Register (if Rc=1), are undefined.
Programming Note
The Floating Round to Single-Precision instruction is provided to allow value conversion
from double-precision to single-precision with appropriate exception checking and
rounding. This instruction should be used to convert double-precision floating-point val-
ues (produced by double-precision load and arithmetic instructions and by fcfid) to
single-precision values prior to storing them into single format storage elements or using
them as operands for single-precision arithmetic instructions. Values produced by sin-
gle-precision load and arithmetic instructions are already single-precision values and
can be stored directly into single format storage elements, or used directly as operands
for single-precision arithmetic instructions, without preceding the store, or the arith-
metic instruction, by a Floating Round to Single-Precision instruction.
Programming Note
A single-precision value can be used in double-precision arithmetic operations. The
reverse is true only if the double-precision value is representable in single format.
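One way to check on a host system whether a numeric double-precision value is representable in single format is to round-trip it through a 32-bit encoding. The sketch below uses Python's standard `struct` module; the function name is illustrative, and the check is meant for numeric values only (a NaN compares unequal to itself and therefore yields False).

```python
import struct

def representable_in_single(x):
    """True if the double x survives conversion to single format unchanged."""
    try:
        return struct.unpack(">f", struct.pack(">f", x))[0] == x
    except OverflowError:
        # The magnitude is too large for single format
        return False
```

A value such as 0.5 passes the check, while 0.1 does not, because its double-format fraction has nonzero bits beyond the 23 fraction bits that single format retains.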
5.3.6 Rounding
The material in this section applies to operations that have numeric operands
(i.e., operands that are not infinities or NaNs). Rounding the intermediate result of
such an operation may cause an Overflow Exception, an Underflow Exception, or
an Inexact Exception. The remainder of this section assumes that the operation
causes no exceptions and that the result is numeric. See Section 5.3.2 on page 74
and Section 5.4 on page 81 for the cases not covered here.
The instructions that round their intermediate result are the Arithmetic and
Rounding and Conversion instructions. Each of these instructions sets Floating-
Point Status and Control Register bits FR and FI. If the fraction was incremented
during rounding then FR is set to 1, otherwise FR is set to 0. If the rounded result
is inexact then FI is set to 1, otherwise FI is set to 0.
The two Estimate instructions set FR and FI to undefined values. The remaining
floating-point instructions do not alter FR and FI.
Figure 5-4 shows the relation of Z, Z1, and Z2 in this case. The following rules
specify the rounding in the four modes. ‘lsb’ means ‘least-significant bit’.
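[Figure 5-4: number line showing Z2, Z, and Z1 on each side of 0, once for
negative values and once for positive values. Z is the infinitely precise value;
the neighboring values are obtained by truncating after the lsb and by
incrementing the lsb of Z.]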
Round to Nearest
Choose the value that is closer to Z (Z1 or Z2). In case of a tie, choose the one
that is even (least significant bit 0).
SNaN
Infinity-Infinity
Infinity÷Infinity
Zero÷Zero
Infinity×Zero
Invalid Compare
Software Request
Invalid Square Root
Invalid Integer Convert
A single instruction, other than mtfsfi or mtfsf, may set more than one exception
bit only in the following cases:
• Invalid Operation Exception (SNaN) may be set with Invalid Operation Excep-
tion (Invalid Compare) for Compare Ordered instructions.
• Invalid Operation Exception (SNaN) may be set with Invalid Operation Excep-
tion (Invalid Integer Convert) for Convert To Integer instructions.
For the remaining kinds of exception, a result is generated and written to the des-
tination specified by the instruction causing the exception. The result may be a
different value for the enabled and disabled conditions for some of these excep-
tions. The kinds of exception that deliver a result are the following:
Subsequent sections define each of the floating-point exceptions and specify the
action that is taken when they are detected.
The IEEE default behavior when an exception occurs is to generate a default value
and not to notify software. In this architecture, if the IEEE default behavior when
an exception occurs is desired for all exceptions, all Floating-Point Status and
Control Register exception enable bits should be set to 0 and Ignore Exceptions
Mode (see below) should be used. In this case the Enabled exception type Program
interrupt is not taken, even if floating-point exceptions occur: software can
inspect the Floating-Point Status and Control Register exception bits if necessary,
to determine whether exceptions have occurred.
The FE0 and FE1 bits control whether and how an Enabled exception type Pro-
gram interrupt is taken if an enabled floating-point exception occurs. The location
of these bits and the requirements for altering them are described in Section 2.1.1
on page 39. (An Enabled exception type Program interrupt is never taken because
FE0 = 1, FE1 = 1: Precise Mode
An Enabled exception type Program interrupt is taken precisely at
the instruction that caused the enabled exception.
Architecture Note
The FE0 and FE1 bits of the Machine State Register are defined in Section 2.1.1 on
page 39 in a manner such that they can be changed dynamically and can easily be
treated as part of a process' state.
In all cases, the question of whether a floating-point result is stored, and what
value is stored, is governed by the Floating-Point Status and Control Register
exception enable bits, as described in subsequent sections, and is not affected by
the value of the FE0 and FE1 bits.
In all cases in which an Enabled exception type Program interrupt is taken, all
instructions before the instruction at which the Enabled exception type Program
interrupt is taken have completed, and no instruction after the instruction at
which the Enabled exception type Program interrupt is taken has begun execu-
tion. (Recall that, for the two Imprecise modes, the instruction at which the
Enabled exception type Program interrupt is taken need not be the instruction
that caused the exception.) The instruction at which the Enabled exception type
Program interrupt is taken has not been executed unless it is the excepting
Programming Note
In any of the three non-Precise modes, a Floating-Point Status and Control Register
instruction can be used to force any exceptions, due to instructions initiated before the
Floating-Point Status and Control Register instruction, to be recorded in the Floating-
Point Status and Control Register. (This forcing is superfluous for Precise Mode.)
In either of the Imprecise modes, a Floating-Point Status and Control Register instruction
can be used to force any invocations of the Enabled exception type Program interrupt,
due to instructions initiated before the Floating-Point Status and Control Register instruc-
tion, to occur. (This forcing has no effect in Ignore Exceptions Mode, and is superfluous
for Precise Mode.)
In order to obtain the best performance across the widest range of implementa-
tions, the programmer should obey the following guidelines.
• If the IEEE default results are acceptable to the application, Ignore Excep-
tions Mode should be used with all Floating-Point Status and Control Register
exception enable bits set to 0.
• If the IEEE default results are not acceptable to the application, Imprecise
Nonrecoverable Mode should be used, or Imprecise Recoverable Mode if recov-
erability is needed, with Floating-Point Status and Control Register exception
enable bits set to 1 for those exceptions for which the Enabled exception type
Program interrupt is to be taken.
• Ignore Exceptions Mode should not, in general, be used when any Floating-
Point Status and Control Register exception enable bits are set to 1.
Engineering Note
It is permissible for the implementation to be precise in any of the three modes that per-
mit interrupts, or to be recoverable in Nonrecoverable Mode.
Definition
An Invalid Operation Exception occurs when an operand is invalid for the speci-
fied operation. The invalid operations are:
Programming Note
The purpose of FPSCRVXSOFT is to allow software to cause an Invalid Operation Excep-
tion for a condition that is not necessarily associated with the execution of a floating-
point instruction. For example, it might be set by a program that computes a square
root, if the source operand is negative.
Action
The action to be taken depends on the setting of the Invalid Operation Exception
Enable bit of the Floating-Point Status and Control Register.
FPSCRFPRF is undefined
FPR(FRT)0:31 ← undefined
FPSCRFPRF is undefined
Definition
A Zero Divide Exception occurs when a Divide instruction is executed with a zero
divisor value and a finite nonzero dividend value. It also occurs when a Reciprocal
Estimate instruction (fres or frsqrte) is executed with an operand value of zero.
Architecture Note
The name is a misnomer used for historical reasons. The proper name for this exception
should be ‘Exact Infinite Result from Finite Operands’ corresponding to what mathema-
ticians call a ‘pole’.
Action
The action to be taken depends on the setting of the Zero Divide Exception Enable
bit of the Floating-Point Status and Control Register.
When Zero Divide Exception is enabled (FPSCRZE=1) and Zero Divide occurs, the
following actions are taken:
FPSCRZX ← 1
4. FPSCRFPRF is unchanged
When Zero Divide Exception is disabled (FPSCRZE=0) and Zero Divide occurs, the
following actions are taken:
FPSCRZX ← 1
2. The target Floating-Point Register is set to ±Infinity, where the sign is deter-
mined by the XOR of the signs of the operands
4. FPSCRFPRF is set to indicate the class and sign of the result (±Infinity)
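The value delivered in the disabled case can be sketched in Python. The function name is hypothetical, and the sketch assumes the caller has already established that a Zero Divide occurred (a finite nonzero dividend and a zero divisor); it shows only how the sign of the infinity is chosen.

```python
import math

def zero_divide_default(dividend, divisor):
    """Return ±Infinity with the sign given by the XOR of the operand signs."""
    dividend_negative = math.copysign(1.0, dividend) < 0
    divisor_negative = math.copysign(1.0, divisor) < 0   # distinguishes -0 from +0
    negative = dividend_negative != divisor_negative     # XOR of the signs
    return -math.inf if negative else math.inf
```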
Definition
Overflow occurs when the magnitude of what would have been the rounded result
if the exponent range were unbounded exceeds that of the largest finite number of
the specified result precision.
Action
The action to be taken depends on the setting of the Overflow Exception Enable
bit of the Floating-Point Status and Control Register.
FPSCROX ← 1
4. The adjusted rounded result is placed into the target Floating-Point Register
5. FPSCRFPRF is set to indicate the class and sign of the result (±Normal
Number)
When Overflow Exception is disabled (FPSCROE=0) and overflow occurs, the fol-
lowing actions are taken:
FPSCROX ← 1
FPSCRXX ← 1
3. The result is determined by the rounding mode (FPSCRRN) and the sign of the
intermediate result as follows:
A. Round to Nearest
Store ± Infinity, where the sign is the sign of the intermediate result
5. FPSCRFR is undefined
6. FPSCRFI is set to 1
7. FPSCRFPRF is set to indicate the class and sign of the result (±Infinity or ±Nor-
mal Number)
Definition
Underflow Exception is defined separately for the enabled and disabled states:
• Enabled:
Underflow occurs when the intermediate result is ‘Tiny’.
• Disabled:
Underflow occurs when the intermediate result is ‘Tiny’ and there is ‘Loss of
Accuracy’.
‘Loss of Accuracy’ is detected when the delivered result value differs from what
would have been computed were both the precision and the exponent range
unbounded.
Action
The action to be taken depends on the setting of the Underflow Exception Enable
bit of the Floating-Point Status and Control Register.
FPSCRUX ← 1
4. The adjusted rounded result is placed into the target Floating-Point Register
5. FPSCRFPRF is set to indicate the class and sign of the result (±Normalized
Number)
Programming Note
The FR and FI bits are provided to allow the Enabled exception type Program interrupt,
when taken because of an Underflow Exception, to simulate a ‘trap disabled’ environ-
ment. That is, the FR and FI bits allow the Enabled exception type Program interrupt to
unround the result, thus allowing the result to be denormalized.
FPSCRUX ← 1
3. FPSCRFPRF is set to indicate the class and sign of the result (±Normalized
Number, ±Denormalized Number, or ±Zero)
Definition
An Inexact Exception occurs when one of two conditions occurs during rounding:
1. The rounded result differs from the intermediate result assuming both the
precision and the exponent range of the intermediate result to be unbounded.
In this case the result is said to be inexact. (If the rounding causes an enabled
Overflow Exception or an enabled Underflow Exception, an Inexact Exception
also occurs only if the significands of the rounded result and the intermediate
result differ.)
Action
The action to be taken does not depend on the setting of the Inexact Exception
Enable bit of the Floating-Point Status and Control Register.
FPSCRXX ← 1
Programming Note
In some implementations, enabling Inexact Exceptions may degrade performance more
than does enabling other types of floating-point exception.
All implementations of this architecture must provide the equivalent of the follow-
ing execution models to ensure that identical results are obtained.
Special rules are provided in the definition of the computational instructions for
the infinities, denormalized numbers and NaNs. The material in the remainder of
this section applies to instructions that have numeric operands and a numeric
result (i.e., operands and result that are not infinities or NaNs), and that cause no
exceptions. See Section 5.3.2 on page 74 and Section 5.4 on page 81 for the cases
not covered here.
The IEEE standard includes 32-bit and 64-bit arithmetic. The standard requires
that single-precision arithmetic be provided for single-precision operands. The
standard permits double-precision floating-point operations to have either (or
both) single-precision or double-precision operands, but states that single-preci-
sion floating-point operations should not accept double-precision operands.
Book E follows these guidelines: double-precision arithmetic instructions can
have operands of either or both precisions, while single-precision arithmetic
instructions require all operands to be single-precision. Double-precision arith-
metic instructions and fcfid produce double-precision values, while single-preci-
sion arithmetic instructions produce single-precision values.
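[Figure: accumulator of the IEEE execution model. Fields, left to right: S, C, L,
FRACTION, G, R, X. Bit 0 is L, bits 1:52 are FRACTION, and bits 53:55 are G, R,
and X.]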
The L bit is the leading unit bit of the significand, which receives the implicit bit
from the operand.
The FRACTION is a 52-bit field that accepts the fraction of the operand.
The Guard (G), Round (R), and Sticky (X) bits are extensions to the low-order bits
of the accumulator. The G and R bits are required for post-normalization of the
result. The G, R, and X bits are required during rounding to determine if the inter-
mediate result is equally near the two nearest representable values. The X bit
serves as an extension to the G and R bits by representing the logical OR of all
bits that may appear to the low-order side of the R bit, due either to shifting the
accumulator right or to other generation of low-order result bits. The G and R bits
participate in the left shifts with zeros being shifted into the R bit. Table 5-4
shows the significance of the G, R, and X bits with respect to the intermediate
result (IR), the representable number next lower in magnitude (NL), and the repre-
sentable number next higher in magnitude (NH).
G R X   Interpretation
0 0 0   IR is exact
0 0 1   IR closer to NL
0 1 0   IR closer to NL
0 1 1   IR closer to NL
1 0 0   IR midway between NL and NH
1 0 1   IR closer to NH
1 1 0   IR closer to NH
1 1 1   IR closer to NH
After normalization, the intermediate result is rounded, using the rounding mode
specified by FPSCRRN. If rounding results in a carry into C, the significand is
shifted right one position and the exponent incremented by one. This yields an
inexact result and possibly also exponent overflow. Fraction bits to the left of the
bit position used for rounding are stored into the Floating-Point Register and low-
order bit positions, if any, are set to zero.
Table 5-5. Location of the Guard, Round, and Sticky bits in the IEEE execution model
Rounding can be treated as though the significand were shifted right, if required,
until the least significant bit to be retained is in the low-order bit position of the
FRACTION. If any of the Guard, Round, or Sticky bits is nonzero, then the result
is inexact.
Z1 and Z2, as defined on page 80, can be used to approximate the result in the
target format when one of the following rules is used.
Guard bit = 0
The result is truncated. (Result exact (GRX = 000) or closest to next lower
value in magnitude (GRX = 001, 010, or 011))
Guard bit = 1
Depends on Round and Sticky bits:
Case a
If the Round or Sticky bit is 1 (inclusive), the result is incremented.
(Result closest to next higher value in magnitude (GRX = 101, 110, or
111))
Case b
If the Round and Sticky bits are 0 (result midway between closest rep-
resentable values), then if the low-order bit of the result is 1 the result
is incremented. Otherwise (the low-order bit of the result is 0) the
result is truncated (this is the case of a tie rounded to even).
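The Guard, Round, and Sticky rules above can be sketched in Python for an integer significand that has already been truncated to the target precision. The function name and return convention are illustrative. The sketch does not handle a carry out of the significand, which would additionally require the right shift and exponent increment described earlier.

```python
def round_to_nearest(significand, g, r, x):
    """Apply the G/R/X rules; return (rounded significand, incremented, inexact)."""
    inexact = bool(g or r or x)          # any nonzero G, R, or X means inexact
    if g == 0:
        increment = False                # exact, or closer to the next lower value
    elif r or x:
        increment = True                 # Case a: closer to the next higher value
    else:
        increment = (significand & 1) == 1   # Case b: tie, round to even
    return significand + int(increment), increment, inexact
```

The second and third return values correspond to the conditions under which an instruction would set FR and FI, respectively.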
Where the result is to have fewer than 53 bits of precision because the instruction
is a Floating Round to Single-Precision or single-precision arithmetic instruction,
the intermediate result is either normalized or placed in correct denormalized
form before being rounded.
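[Figure: accumulator of the multiply-add execution model. Fields, left to right:
S, C, L, FRACTION, X’. Bit 0 is L, bits 1:105 are FRACTION, and bit 106 is X’.]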
The first part of the operation is a multiplication. The multiplication has two 53-
bit significands as inputs, which are assumed to be prenormalized, and produces
a result conforming to the above model. If there is a carry out of the significand
(into the C bit), then the significand is shifted right one position, shifting the L bit
The result of the addition is then normalized, with all bits of the addition result,
except the X' bit, participating in the shift. The normalized result serves as the
intermediate result that is input to the rounder.
For rounding, the conceptual Guard, Round, and Sticky bits are defined in terms
of accumulator bits. Table 5-6 shows the positions of the Guard, Round, and
Sticky bits for double-precision and single-precision floating-point numbers in the
multiply-add execution model.
Table 5-6. Location of the Guard, Round, and Sticky bits in the multiply-add execution
model
The rules for rounding the intermediate result are the same as those given in
Section 5.5.1 on page 94.
Architecture Note
The rules followed in assigning new primary and extended opcodes.
3. In assigning new extended opcodes for primary opcode 63, the following regularities
are maintained. In addition, all new X-form instructions in primary opcode 63 have
bits 21:22 = 0b11.
• Bits 26:29 = 0b0000 iff the instruction is a comparison or mcrfs (i.e., iff the
instruction sets an explicitly-designated CR field).
• Bits 26:28 = 0b001 iff the instruction explicitly refers to or sets the Floating-
Point Status and Control Register (i.e., is a Floating-Point Status and Control
Register instruction) and is not mcrfs.
• Bits 26:30 = 0b01000 iff the instruction is a Move Register instruction, or any
other instruction that does not refer to or set the Floating-Point Status and
Control Register.
4. In assigning extended opcodes for primary opcode 59, the following regularities have
been maintained. They are based on those rules for primary opcode 63 that apply to
the instructions having primary opcode 59. In particular, primary opcode 59 has no
Floating-Point Status and Control Register instructions, so the corresponding rule
does not apply.
• Bits 26:30 = 0b01000 iff the instruction is a Move Register instruction, or any
other instruction that does not refer to or set the Floating-Point Status and
Control Register.
There are two basic forms of load instruction: single-precision and double-preci-
sion. Because the FPRs support only floating-point double format, single-preci-
sion Load Floating-Point instructions convert single-precision data to double
format prior to loading the operand into the target Floating-Point Register. The
conversion and loading steps are as follows.
Denormalized Operand
if WORD1:8 = 0 and WORD9:31 ≠ 0 then
   sign ← WORD0
   exp ← -126
   frac0:52 ← 0b0 || WORD9:31 || ²⁹0
   normalize the operand
      do while frac0 = 0
         frac ← frac1:52 || 0b0
         exp ← exp - 1
   FPR(FRT)0 ← sign
   FPR(FRT)1:11 ← exp + 1023
   FPR(FRT)12:63 ← frac1:52
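The denormalized-operand steps above can be modeled in Python as shown below. This is a sketch under stated assumptions: the function name load_single_denorm is hypothetical, WORD and the FPR image are passed as unsigned integers, and bit 0 is taken to be the most significant bit, as in the pseudocode.

```python
import struct


def load_single_denorm(word: int) -> int:
    """Return the 64-bit FPR image for a denormalized single-precision WORD."""
    exp_field = (word >> 23) & 0xFF      # WORD1:8
    frac_field = word & 0x7FFFFF         # WORD9:31
    if not (exp_field == 0 and frac_field != 0):
        raise ValueError("not a denormalized single-precision operand")
    sign = word >> 31                    # WORD0
    exp = -126
    # frac0:52 = 0b0 || WORD9:31 || 29 zero bits (53 bits in total).
    frac = frac_field << 29
    # Normalize: shift left until frac0 (the top bit of the 53) is 1.
    while (frac >> 52) & 1 == 0:
        frac = (frac << 1) & ((1 << 53) - 1)
        exp -= 1
    # Assemble sign, biased exponent, and frac1:52.
    return (sign << 63) | ((exp + 1023) << 52) | (frac & ((1 << 52) - 1))


def as_double(bits: int) -> float:
    """Reinterpret a 64-bit integer as an IEEE double (helper for checking)."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]
```

For the smallest denormalized single (WORD = 0x00000001) the loop shifts 23 times, giving exp = -149 and a biased double exponent of 874.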
Engineering Note
The above description of the conversion steps is a model only. The actual implementa-
tion may vary from this but must produce results equivalent to what this model would
produce.
Floating-Point Load storage accesses will cause a Data Storage interrupt if the
program is not allowed to read the storage location. Floating-Point Load storage
accesses will cause a Data TLB Error interrupt if the program attempts to access
storage that is unavailable.
Note: Recall that RA and RB denote General Purpose Registers, while FRT
denotes a Floating-Point Register.
Engineering Note
Implementations are strongly recommended to ignore bit 31 of instruction encodings for
X-form Floating-Point Load instructions.
Denormalization Required
if 874 ≤ FRS1:11 ≤ 896 then
   sign ← FPR(FRS)0
   exp ← FPR(FRS)1:11 – 1023
   frac ← 0b1 || FPR(FRS)12:63
   denormalize operand
      do while exp < –126
         frac ← 0b0 || frac0:62
         exp ← exp + 1
   WORD0 ← sign
   WORD1:8 ← 0x00
   WORD9:31 ← frac1:23
else WORD ← undefined
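A Python model of the denormalization branch above is sketched below. The function name store_single_denorm is hypothetical, and the sketch assumes that frac is a 64-bit quantity with the implicit 1 in bit 0 (the most significant bit), so that frac1:23 are the 23 bits immediately below it. Bits shifted below frac23 are simply not stored, as in the pseudocode.

```python
import struct


def store_single_denorm(fpr: int) -> int:
    """Return the 32-bit WORD stored for a double whose exponent field is 874..896."""
    exp_field = (fpr >> 52) & 0x7FF                    # FPR(FRS)1:11
    if not (874 <= exp_field <= 896):
        raise ValueError("exponent outside the range handled by this branch")
    sign = fpr >> 63                                   # FPR(FRS)0
    exp = exp_field - 1023
    # frac = 0b1 || FPR(FRS)12:63, left-aligned in a 64-bit value.
    frac = ((1 << 52) | (fpr & ((1 << 52) - 1))) << 11
    # Denormalize: shift right until the exponent reaches -126.
    while exp < -126:
        frac >>= 1
        exp += 1
    frac_1_23 = (frac >> 40) & 0x7FFFFF                # frac1:23
    return (sign << 31) | frac_1_23                    # WORD1:8 = 0x00
```

For a value that is exactly representable as a denormalized single, such as 2^-149, the result matches the IEEE single-precision encoding (0x00000001).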
For double-precision Store Floating-Point instructions and for the Store Floating-
Point as Integer Word instruction no conversion is required, as the data from the
Floating-Point Register are copied directly into storage.
Floating-Point Store storage accesses will cause a Data Storage interrupt if the pro-
gram is not allowed to write to the storage location. Floating-Point Store storage
accesses will cause a Data TLB Error interrupt if the program attempts to access
storage that is unavailable.
Note: Recall that RA and RB denote General Purpose Registers, while FRS
denotes a Floating-Point Register.
Engineering Note
Implementations are strongly recommended to ignore bit 31 of instruction encodings for
X-form Floating-Point Store instructions.
These instructions copy data from one floating-point register to another, altering
the sign bit (bit 0) as described below for fneg, fabs, and fnabs. These instruc-
tions treat NaNs just like any other kind of value (e.g., the sign bit of a NaN may
be altered by fneg, fabs, and fnabs). These instructions do not alter the Floating-
Point Status and Control Register.
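As an illustration of the sign-bit handling, the following Python sketch operates on 64-bit register images held as integers. It assumes the conventional definitions (fneg inverts the sign bit, fabs clears it, fnabs sets it); the function names simply mirror the mnemonics and are not an API of any library.

```python
SIGN_BIT = 1 << 63            # bit 0 in the big-endian numbering used here
MASK64 = (1 << 64) - 1


def fneg(fpr: int) -> int:
    """Invert the sign bit; all other bits are copied unchanged."""
    return (fpr ^ SIGN_BIT) & MASK64


def fabs(fpr: int) -> int:
    """Clear the sign bit."""
    return fpr & ~SIGN_BIT & MASK64


def fnabs(fpr: int) -> int:
    """Set the sign bit."""
    return (fpr | SIGN_BIT) & MASK64
```

Because only bit 0 is touched, a NaN image such as 0x7FF8000000000001 is treated like any other value: fneg yields 0xFFF8000000000001 with the payload intact.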
• Overflow, Underflow, and Inexact Exception bits, the FR and FI bits, and the
FPRF field are set based on the final result of the operation, and not on the
result of the multiplication.
• Invalid Operation Exception bits are set as if the multiplication and the addi-
tion were performed using two separate instructions (fmul[s], followed by
fadd[s] or fsub[s]). That is, multiplication of infinity by 0 or of anything by an
SNaN, and/or addition of an SNaN, cause the corresponding exception bits to
be set.
Programming Note
Examples of uses of these instructions to perform various conversions can be found in
Section C.3 on page 389.
The comparison sets one bit in the designated CR field to 1 and the other three to
0. The FPCC is set in the same way.
Bit  Name  Description
0    FL    (FRA) < (FRB)
1    FG    (FRA) > (FRB)
2    FE    (FRA) = (FRB)
3    FU    (FRA) ? (FRB) (unordered)
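The one-hot encoding of the comparison result can be sketched in Python as follows. The function name fp_compare is hypothetical; the 4-bit return value uses bit 0 (FL) as its most significant bit, and exception recording for NaN operands is deliberately left out of the sketch.

```python
import math


def fp_compare(fra: float, frb: float) -> int:
    """Return the 4-bit FL/FG/FE/FU field; exactly one bit is set."""
    if math.isnan(fra) or math.isnan(frb):
        return 0b0001     # bit 3, FU: unordered
    if fra < frb:
        return 0b1000     # bit 0, FL
    if fra > frb:
        return 0b0100     # bit 1, FG
    return 0b0010         # bit 2, FE
```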
• All exceptions that will be caused by the previously initiated instructions are
recorded in the Floating-Point Status and Control Register before the Floating-
Point Status and Control Register instruction is initiated.
• All invocations of the Enabled exception type Program interrupt that will be
caused by the previously initiated instructions have occurred before the Float-
ing-Point Status and Control Register instruction is initiated.
6.1.1 Introduction
Section 1.11 defines storage as a linear array of bytes indexed from 0 to a maximum
of 2^64 – 1. Each byte is identified by its index, called its address, and each
byte contains a value. This information is sufficient to allow the programming of
applications that require no special features of any particular system environ-
ment. This chapter expands this simple storage model to include caches, virtual
storage, and shared storage multiprocessors, and in conjunction with services
provided by the operating system, describes a mechanism that permits explicit
control of this expanded storage model. A simple model for sequential execution
allows at most one storage access to be performed at a time and requires that all
storage accesses appear to be performed in program order. In contrast to this sim-
ple model, Book E specifies a relaxed model of memory consistency. In a multipro-
cessor system that allows multiple copies of a location, aggressive
implementations of Book E can permit intervals of time during which different
copies of a location have different values. This chapter describes features of
Book E that enable programmers to write correct programs for this memory
model.
A program references storage using the effective address computed by the proces-
sor when it executes a load, store, branch, or cache management instruction, and
when it fetches the next sequential instruction. The effective address is translated
to a real address according to procedures described in Section 6.2.2 and Section
6.2.3. The real address is sent to the memory subsystem to perform the storage
access (see Figure 6-2 on page 128).
Each program can access 2^64 bytes of ‘effective address’ (EA) space, subject to
limitations imposed by the operating system. In a typical Book E system, each
program's EA space is a subset of a larger ‘virtual address’ (VA) space managed by
the operating system.
In general, real storage may not be large enough to map all the virtual pages used
by the currently active applications. With support provided by hardware, the oper-
ating system can attempt to use the available real pages to map a sufficient set of
The operating system can support restricted access to virtual pages (including
individual enables for user state read, write, and execute, and supervisor state
read, write, and execute: see Section 6.2.4), based on system standards (e.g. pro-
gram code might be execute only, data structures mapped as read/write/no exe-
cute) and application requests.
Instructions are fetched using the address translated by the TLB mechanism.
Instructions are not fetched from no-execute storage (UX=0 or SX=0, see Section
6.2.4.1). If the effective address of the current instruction is mapped to no-execute
storage, an Instruction Storage interrupt is generated.
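The per-page access enables can be pictured with a small Python sketch. It assumes six enable bits per page (user and supervisor execute, write, and read) selected by the MSR PR bit, with PR=1 denoting user state; the class PagePerms and the function access_permitted are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PagePerms:
    """Hypothetical container for the six per-page access enables."""
    ux: bool = False
    sx: bool = False
    uw: bool = False
    sw: bool = False
    ur: bool = False
    sr: bool = False


def access_permitted(perms: PagePerms, msr_pr: int, kind: str) -> bool:
    """Check an 'execute', 'write', or 'read' access against the page enables."""
    enables = {
        "execute": (perms.sx, perms.ux),
        "write": (perms.sw, perms.uw),
        "read": (perms.sr, perms.ur),
    }
    supervisor_bit, user_bit = enables[kind]
    # PR=1 selects the user-state enable, PR=0 the supervisor-state enable.
    return user_bit if msr_pr == 1 else supervisor_bit
```

With this model, a page marked no-execute for both states (ux and sx both false) rejects every instruction fetch, which is the case in which an Instruction Storage interrupt would be generated.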
In Book E the following single-register accesses (i.e. aligned scalar accesses less
than or equal to the implemented width of the storage interface) are always
atomic:
No other accesses are guaranteed to be atomic. For example, the access caused by
the following instructions is not guaranteed to be atomic.
The results for several combinations of loads and stores to the same or overlap-
ping locations are described below.
• When two processors execute atomic stores to locations that do not overlap,
and no other stores are performed to those locations, the contents of those
locations are the same as if the two stores were performed by a single
processor.
• When two processors execute atomic stores to the same storage location, and
no other store is performed to that location, the contents of that location are
the result stored by one of the processors.
• When two processors execute stores that have the same target location and
are not guaranteed to be atomic, and no other store is performed to that loca-
tion, the result is some combination of the bytes stored by both processors.
Engineering Note
Atomicity of storage accesses is provided by the processor in conjunction with the stor-
age subsystem. The processor must provide a storage subsystem interface that is
sufficient to allow a storage subsystem to meet the atomicity requirements specified
here.
A cache model in which there is one cache for instructions and another cache for
data is called a ‘Harvard-style’ cache. This is the model assumed by Book E, e.g.,
in the descriptions of the Cache Management instructions in Section 6.3.2. Alter-
native cache models may be implemented (e.g., a ‘combined cache’ model, in
which a single cache is used for both instructions and data, or a model in which
there are several levels of caches), but they must support the programming model
implied by a Harvard-style cache.
Cache Management instructions are provided so that programs can manage the
caches when needed. For example, program management of the caches is needed
when a program generates or modifies code that will be executed (i.e., when the
program modifies data in storage and then attempts to execute the modified data
as instructions). The Cache Management instructions are also useful in optimizing
the use of memory bandwidth in such applications as graphics and numerically
intensive computing. The functions performed by these instructions depend on
the storage attributes associated with the specified storage location (see Section
6.2.5).
• give a hint that a block of storage should be copied to the instruction cache,
so that the copy of the block is more likely to be in the cache when subse-
quent accesses to the block occur, thereby reducing delays (icbt[e])
• give a hint that a block of storage should be copied to the data cache, so that
the copy of the block is more likely to be in the cache when subsequent
accesses to the block occur, thereby reducing delays (dcbt[e], dcbtst[e])
• copy the contents of a modified data cache block to main storage (dcbst[e])
• copy the contents of a modified data cache block to main storage and make
the copy of the block in the data cache invalid (dcbf[e]).
Architecture Note
In earlier versions of the architecture specification, ‘speculative’ was used instead of
‘out-of-order’. The terminology was changed to be consistent with the technical litera-
ture, where ‘speculative execution’ often means the execution of instructions past
unresolved branches and ‘out-of-order execution’ means execution of an instruction
before it is known to be required by the sequential execution model. Because the mean-
ing of ‘speculative’ in the literature differs from ordinary English usage the term would
cause confusion no matter how the architecture specification defined it, so the term is
no longer used here at all.
• Stores
No error of any kind other than Machine Check may be reported due to an opera-
tion that is performed out-of-order, until such time as it is known that the opera-
Engineering Note
Out-of-order execution of the Storage Synchronization instructions lwarx[e], ldarxe,
stwcx[e]., and stdcxe. is extremely complex and is not recommended.
Engineering Note
Because an asynchronous exception can become pending at any time, it might seem
that, for example, if MSR[EE]=1 then fetching or executing any instruction beyond the
current instruction is an out-of-order operation. However, these operations need not be
treated as out-of-order if the taking of the interrupt is delayed until after they have com-
pleted. Similar considerations apply to Floating-Point Enabled Exception type Program
interrupts when one of the Imprecise floating-point exception modes is in effect.
Engineering Note
Implementations that perform operations out-of-order must take care to obey the
sequential execution model except as permitted by Book E. Examples of cases that may
require special attention include the following.
• changes of control flow, including sc, Trap, rfi, rfci, and interrupts as well as
branches
• changes of context due to changes of control flow. For example, the code at a branch
target location, or the handler for System Call or Trap interrupts, may change the
context and then return, so that the instructions immediately following the Branch,
sc, or Trap execute in a new context
• changes to resources, including but not limited to MSR[PR], MSR[IS], MSR[DS], and TLB entries,
that affect address translation, access control, or storage control attributes, when
the change is followed by the appropriate software synchronization
• Load Instruction
If a copy of the location is in a cache then the location may be accessed in the
cache or in main storage.
• Instruction Fetch
Programming Note
Software should mark guarded space as no-execute (UX=0 and SX=0) to prevent
inadvertent instruction fetch from guarded areas of storage.
When the same storage location has different effective addresses, the addresses
are said to be aliases. Each application can be granted separate access privileges
to aliased pages.
Engineering Note
Page-level aliasing can be implemented in many ways, such as with real-addressed
caches, L2 directories, or an external signal to an inverse directory. Each processor
implementation will decide on its level of implementation in support of its system
requirements.
The order in which the processor performs storage accesses, the order in which
those accesses are performed with respect to another processor or mechanism,
and the order in which those accesses are performed in main storage may all be
different. Several means of enforcing an ordering of storage accesses are provided
to allow programs to share storage with other programs, or with mechanisms
such as I/O devices. These means are listed below. The phrase ‘to the extent
required by the associated Memory Coherence Required attributes’ refers to the
Memory Coherence Required attribute, if any, associated with each access.
• If two Store instructions specify storage locations that are both Caching Inhib-
ited and Guarded, the corresponding storage accesses are performed in
program order with respect to any processor or mechanism.
The memory barrier created by msync is cumulative, and applies to all stor-
age accesses except those associated with fetching instructions following the
msync instruction. See the definition of mbar on page 304 for a description of
the corresponding properties of the memory barrier created by that
instruction.
Programming Note
The first example below illustrates cumulative ordering of storage accesses preceding a
memory barrier, and the second illustrates cumulative ordering of storage accesses fol-
lowing a memory barrier. Assume that locations X, Y, and Z initially contain the value 0.
Example 1:
Example 2:
Processor B: loops loading from location Y until the value 2 is obtained, then stores
the value 3 to location Z
In both cases, cumulative ordering dictates that the value loaded from location X by pro-
cessor C is 1.
Engineering Note
It is permissible to perform a dependent load before the load on which it depends, if soft-
ware accessing shared storage cannot tell the difference.
It is always permissible to prefetch a data cache block from non-Guarded storage based
on predicting the effective address specified by a Load or Store instruction.
Because an isync instruction prevents the execution of instructions following the isync
until instructions preceding the isync have completed, if an isync follows a conditional
Branch instruction that depends on the value returned by a preceding Load instruction,
the load on which the Branch depends is performed before any loads caused by instruc-
tions following the isync. This applies even if the effects of the ‘dependency’ are
independent of the value loaded (e.g., the value is compared to itself and the Branch
tests the EQ bit in the selected CR field), and even if the branch target is the next
sequential instruction to be executed.
With the exception of the cases described above and earlier in this section, data depen-
dencies and control dependencies do not order storage accesses. Examples include the
following.
• If a Load instruction specifies the same storage location as a preceding Store instruc-
tion and the location is in storage that is not Caching Inhibited, the load may be
satisfied from a ‘store queue’ (a buffer into which the processor places stored values
before presenting them to the storage subsystem), and not be visible to other proces-
sors and mechanisms. A consequence is that if a subsequent Store depends on the
value returned by the Load, the two stores need not be performed in program order
with respect to other processors and mechanisms.
• Because a Store Conditional instruction may complete before its store has been per-
formed, a conditional Branch instruction that depends on the CR0 value set by a
Store Conditional instruction does not order the Store Conditional's store with respect
to storage accesses caused by instructions that follow the Branch.
• Because processors may predict branch target addresses and branch condition reso-
lution, control dependencies (e.g., branches) do not order storage accesses except as
described above. For example, when a subroutine returns to its caller the return
address may be predicted, with the result that loads caused by instructions at or
after the return address may be performed before the load that obtains the return
address is performed.
Examples of correct uses of dependencies, msync, and mbar to order storage accesses
can be found in Appendix D on page 397.
Because the storage model is weakly consistent, the sequential execution model as
applied to instructions that cause storage accesses guarantees only that those accesses
appear to be performed in program order with respect to the processor executing the
instructions. For example, an instruction may complete, and subsequent instructions
may be executed, before storage accesses caused by the first instruction have been per-
formed. However, for a sequence of atomic accesses to the same storage location, if the
location is in Memory Coherence Required storage, the definition of coherence guaran-
tees that the accesses are performed in program order with respect to any processor or
mechanism that accesses the location coherently. The same applies if the location is in
Caching Inhibited storage.
Because accesses to storage that is Caching Inhibited are performed in main storage,
memory barriers and dependencies on Load instructions order such accesses with
respect to any processor or mechanism even if the storage is not Memory Coherence
Required.
The definition of memory barriers is not intended to preclude address pipelining. If two
applicable Storage Access instructions are separated by msync or mbar, it is permissi-
ble for the address associated with the second instruction to be presented to a given
level of the storage hierarchy before the data access caused by the first instruction has
completed at that level. However, if such pipelining is done, the processor must provide
sufficient information so that the storage subsystem can keep the storage accesses in
the correct order, and the storage subsystem must do so.
Programming Note
The Memory Coherence Required attribute on other processors and mechanisms
ensures that their stores to the specified storage location will cause the reservation cre-
ated by the lwarx, lwarxe, or ldarxe to be lost.
Programming Note
Warning: Support for Load and Reserve and Store Conditional instructions for which the
specified storage location is in storage that is Caching Inhibited is being phased out of
Book E. It is likely not to be provided on future implementations. New programs should
not use these instructions to access Caching Inhibited storage.
Engineering Note
For a given implementation, decisions regarding whether to support Load and Reserve
and Store Conditional instructions that specify a Caching Inhibited storage location, and
how well to make such instructions perform, must include consideration of migration
plans for existing software that uses these instructions in this manner.
The lwarx instruction is a load from a word-aligned location that has two side
effects.
• The storage coherence mechanism is notified that a reservation exists for the
storage location accessed by the lwarx.
A stwcx. performs a store to the target storage location only if the storage location
accessed by the lwarx that established the reservation has not been stored into
by another processor or mechanism between supplying a value for the lwarx and
storing the value supplied by the stwcx. If the storage locations specified by the
two instructions differ the store is not necessarily performed. CR0 is set to indi-
cate whether the store was performed.
If a stwcx. completes but does not perform the store because a reservation no
longer exists, CR0 is set to indicate that the stwcx. completed but storage was not
altered.
Examples of the use of lwarx and stwcx. are given in Appendix C on page 379.
A successful stwcx. to a given location may complete before its store has been
performed with respect to other processors and mechanisms. As a result, a sub-
sequent load or lwarx from the given location on another processor may return a
‘stale’ value. However, a subsequent lwarx from the given location on the other
processor followed by a successful stwcx. on that processor is guaranteed to have
returned the value stored by the first processor’s stwcx. (in the absence of other
stores to the given location).
Reservations
The ability to emulate an atomic operation using lwarx and stwcx. is based on
the conditional behavior of stwcx., the reservation set by lwarx, and the clearing
of that reservation if the target location is modified by another processor or mech-
anism before the stwcx. performs its store.
Programming Note
One use of lwarx and stwcx. is to emulate a ‘Compare and Swap’ primitive like that
provided by the IBM System/370 Compare and Swap instruction: see Appendix C on
page 379. A System/370-style Compare and Swap checks only that the old and current
values of the word being tested are equal, with the result that programs that use such a
Compare and Swap to control a shared resource can err if the word has been modified
and the old value subsequently restored. The combination of lwarx and stwcx.
improves on such a Compare and Swap, because the reservation reliably binds the
lwarx and stwcx. together. The reservation is always lost if the word is modified by
another processor or mechanism between the lwarx and stwcx., so the stwcx. never
succeeds unless the word has not been stored into (by another processor or mecha-
nism) since the lwarx.
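The difference described in this note can be demonstrated with a toy, single-threaded Python model. It is only a sketch: the class ReservedWord and the functions below are hypothetical, the model tracks one word and one reservation per processor, and it assumes (as a modeling choice) that a processor's own ordinary store does not clear its own reservation.

```python
class ReservedWord:
    """Toy model of one storage word plus per-processor reservations."""

    def __init__(self, value: int) -> None:
        self.value = value
        self.reserved_by: set[int] = set()

    def lwarx(self, cpu: int) -> int:
        """Load the word and record a reservation for `cpu`."""
        self.reserved_by.add(cpu)
        return self.value

    def store(self, cpu: int, value: int) -> None:
        """Ordinary store: every other processor's reservation is lost."""
        self.reserved_by &= {cpu}
        self.value = value

    def stwcx(self, cpu: int, value: int) -> bool:
        """Store only if `cpu` still holds its reservation."""
        if cpu not in self.reserved_by:
            return False
        self.value = value
        self.reserved_by.clear()
        return True


def compare_and_swap(word: ReservedWord, old: int, new: int) -> bool:
    """Value-only check, in the style of a System/370 Compare and Swap."""
    if word.value != old:
        return False
    word.value = new
    return True


def aba_demo() -> tuple[bool, bool]:
    """CPU 1 changes the word to 7 and restores 5 while CPU 0 is mid-update."""
    cas_word, rsv_word = ReservedWord(5), ReservedWord(5)
    old = cas_word.value            # CPU 0 samples the value for Compare and Swap
    rsv_word.lwarx(cpu=0)           # CPU 0 loads and reserves
    for word in (cas_word, rsv_word):
        word.store(cpu=1, value=7)
        word.store(cpu=1, value=5)  # old value restored
    return compare_and_swap(cas_word, old, 6), rsv_word.stwcx(cpu=0, value=6)
```

In this model aba_demo() returns (True, False): the value-only comparison succeeds even though the word was modified and restored, while the stwcx. fails because the intervening stores cleared the reservation.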
• Some other processor executes a dcba[e] to the same reservation granule: the
reservation is lost if the instruction causes the target block to be newly estab-
lished in the data cache or to be modified; otherwise whether the reservation
is lost is undefined.
Interrupts (see Chapter 7 on page 143) do not clear reservations (however, system
software invoked by interrupts may clear reservations).
Programming Note
In general, programming conventions must ensure that lwarx and stwcx. specify
addresses that match; a stwcx. should be paired with a specific lwarx to the same stor-
age location. Situations in which a stwcx. may erroneously be issued after some lwarx
other than that with which it is intended to be paired must be scrupulously avoided. For
example, there must not be a context switch in which the processor holds a reservation
on behalf of the old context, and the new context resumes after a lwarx and before the
paired stwcx.. The stwcx. in the new context might succeed, which is not what was
intended by the programmer.
Engineering Note
Reservations must take part in storage coherence. A reservation must be cleared if
another processor receives authorization from the coherence mechanism to store to the
reservation granule.
If an implementation continues to hold a reservation when the cache block in which the
reservation lies is evicted, the reservation must continue to participate in the coherence
protocol. In a snooping implementation, it must join in snooping. In a directory-based
implementation, it must register its interest in the reserved block with the directory
(shared-read access).
If an implementation demands that the reserved block be held in the cache, one way to
satisfy the architectural requirements is the following. The implementation must be able
to protect that block from eviction except by explicit invalidation (e.g., execution of
dcbf[e]) by the processor holding the reservation, and by cross-invalidates received from
other processors, as long as the reservation persists. Caches in such an implementation
must be sufficiently associative that the machine can continue to run with eviction of
the reserved block inhibited.
Forward progress in loops that use lwarx and stwcx. is achieved by a cooperative
effort among hardware, operating system software, and application software.
2. the stwcx. fails because some other processor or mechanism modified loca-
tion X, or
3. the stwcx. fails because the processor's reservation was lost for some other
reason.
In Cases 1 and 2, the system as a whole makes progress in the sense that some
processor successfully modifies location X. Case 3 covers reservation loss
required for correct operation of the rest of the system. This includes cancellation
caused by some other processor writing elsewhere in the reservation granule for
X, as well as cancellation caused by the operating system in managing certain
limited resources such as real memory. It may also include implementation-
dependent causes of reservation loss.
Architecture Note
Book E does not include a ‘fairness guarantee.’ In competing for a reservation, two pro-
cessors can indefinitely lock out a third.
Lock words should be allocated such that contention for the locks and updates to
nearby data structures do not cause excessive reservation losses due to false indi-
cations of sharing that can occur due to the reservation granularity.
A processor holding a reservation on any word in a reservation granule will lose its
reservation if some other processor stores anywhere in that granule. Such prob-
lems can be avoided only by ensuring that few such stores occur. This can most
easily be accomplished by allocating an entire granule for a lock and wasting all
but one word.
Reservation granularity may vary for each implementation. There are no architec-
tural restrictions bounding the granularity implementations must support, so
reasonably portable code must dynamically allocate aligned and padded storage
for locks to guarantee absence of granularity-induced reservation loss.
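One way to picture the aligned and padded allocation is the Python sketch below, which computes one granule-aligned address per lock. The function name lock_addresses is hypothetical, the granule size is assumed to be a power of two obtained at run time from some implementation-specific source, and the value 128 used in the example is purely illustrative.

```python
def lock_addresses(base: int, count: int, granule: int) -> list[int]:
    """Place each lock word at the start of its own reservation granule."""
    if granule <= 0 or granule & (granule - 1):
        raise ValueError("granule size is assumed to be a power of two")
    # Round the base address up to the next granule boundary.
    first = (base + granule - 1) & ~(granule - 1)
    # One full granule per lock; the rest of each granule is deliberately unused.
    return [first + i * granule for i in range(count)]
```

For a buffer starting at 0x1004 and an assumed granule of 128 bytes, three locks land at 0x1080, 0x1100, and 0x1180, so a store to one lock never falls in the granule of another.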
This section describes the address translation facility, access control, and storage
attributes and control for Book E storage.
Book E divides the effective address space into pages. The page represents the
granularity of effective address translation, access control, and storage attributes.
Up to sixteen page sizes (1KB, 4KB, 16KB, 64KB, 256KB, 1MB, 4MB, 16MB,
64MB, 256MB, 1GB, 4GB, 16GB, 64GB, 256GB, 1TB) may be simultaneously
supported. In order for an effective-to-real translation to exist, a valid entry for the
page containing the effective address must be in the Translation Lookaside Buffer
(TLB). Addresses for which no TLB entry exists cause TLB Miss exceptions.
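The sixteen page sizes listed above form a geometric series, each four times the previous one, starting at 1KB. The short Python sketch below regenerates the list; the names PAGE_SIZES and format_size are illustrative only, and nothing here is meant to describe how the SIZE field encodes these values.

```python
# 1KB * 4**i for i = 0..15 gives 1KB, 4KB, ..., 1TB.
PAGE_SIZES = [1024 * 4**i for i in range(16)]


def format_size(size: int) -> str:
    """Render a byte count as KB, MB, GB, or TB (binary units)."""
    units = ["KB", "MB", "GB", "TB"]
    value, unit = size // 1024, 0
    while value >= 1024 and unit < len(units) - 1:
        value //= 1024
        unit += 1
    return f"{value}{units[unit]}"
```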
In addition to the registers described below, the Machine State Register provides
the IS and DS bits, which specify which of the two address spaces the respective
instruction or data storage accesses are directed toward. The MSR[PR] bit is also
used by the Book E storage access control mechanism.
The contents of bits 32:63 of the Process ID Register can be read into bits 32:63 of
GPR(RT) using mfspr RT,PID, setting bits 0:31 of GPR(RT) to 0. The contents of
bits 32:63 of GPR(RS) can be written into the Process ID Register using
mtspr PID,RS. An implementation may opt to implement only the least-significant
n bits of the Process ID Register, where 0 ≤ n ≤ 32, and n must be the same as the
number of implemented bits in the TID field of the TLB entry. The most-significant
32–n bits of the Process ID Register are treated as reserved. See the User’s Manual
for the implementation.
Some implementations may support more than one Process ID Register. See the
User’s Manual for the implementation.
While the TLB is managed by software, Book E does not prohibit an implementa-
tion from implementing partial or full hardware assist for TLB management (e.g.
support of PowerPC Architecture’s virtual memory architecture). However, such
implementations should be able to disable such support with implementation-
dependent software or hardware configuration mechanisms.
Each TLB entry describes a page that is eligible for translation and access con-
trols. Fields in the TLB entry fall into four categories:
While Book E requires the fields prescribed in Tables 6-1, 6-2, 6-3, and 6-4 to be
implemented, no particular TLB entry format is formally specified. Book E does
provide the ability to read or write portions of individual entries using the tlbre
and tlbwe instructions.
Field  Description
V      Valid (1 bit)
       This bit indicates that this TLB entry is valid and may be used for translation.
       The Valid bit for a given entry can be set or cleared with a tlbwe instruction;
       alternatively, the Valid bit for an entry may be cleared by a tlbivax[e] instruction.
EPN    Effective Page Number (54 bits)
       Bits 0:n–1 of the EPN field are compared to bits 0:n–1 of the effective address (EA)
       of the storage access (where n = 64 – log2(page size in bytes) and page size is
       specified by the SIZE field of the TLB entry). See Table 6-5.
       Note
       Implementations may implement bits N:53 of the EPN field, where N≥0. See the
       User’s Manual for the implementation.
Field  Description
RPN    Real Page Number (up to 54 bits)
       Bits 0:n–1 of the RPN field are used to replace bits 0:n–1 of the effective address
       to produce the real address for the storage access (where n = 64 –
       log2(page size in bytes) and page size is specified by the SIZE field of the TLB
       entry). Software must set unused low-order RPN bits (i.e. bits n:53) to 0. See
       Section 6.2.3.
       Note
       Implementations may implement bits M:53 of the RPN field, where M≥0. See the
       User’s Manual for the implementation.
Bit Description
UX User State Execute Enable (1 bit) See Section 6.2.4.1.
=0 Instruction fetch and execution is not permitted from this page while
MSRPR=1 and will cause an Execute Access Control exception type
Instruction Storage interrupt.
=1 Instruction fetch and execution is permitted from this page while
MSRPR=1.
SX Supervisor State Execute Enable (1 bit) See Section 6.2.4.1.
=0 Instruction fetch and execution is not permitted from this page while
MSRPR=0 and will cause an Execute Access Control exception type
Instruction Storage interrupt.
=1 Instruction fetch and execution is permitted from this page while
MSRPR=0.
UW User State Write Enable (1 bit) See Section 6.2.4.2.
=0 Store operations, including dcba[e] and dcbz[e], are not permitted to
this page when MSRPR=1 and will cause a Write Access Control ex-
ception. Except as noted in Table 6-7 on page 131, a Write Access
Control exception will cause a Data Storage interrupt.
=1 Store operations, including dcba[e] and dcbz[e], are permitted to this
page when MSRPR=1.
SW Supervisor State Write Enable (1 bit) See Section 6.2.4.2.
=0 Store operations, including dcba[e], dcbi[e], and dcbz[e], are not per-
mitted to this page when MSRPR=0. Store operations, including dc-
bi[e] and dcbz[e], will cause a Write Access Control exception. Except
as noted in Table 6-7 on page 131, a Write Access Control exception
will cause a Data Storage interrupt.
=1 Store operations, including dcba[e], dcbi[e], and dcbz[e], are permit-
ted to this page when MSRPR=0.
UR User State Read Enable (1 bit) See Section 6.2.4.3.
=0 Load operations (including load-class Cache Management instruc-
tions) are not permitted from this page when MSRPR=1 and will cause
a Read Access Control exception. Except as noted in Table 6-7 on
page 131, a Read Access Control exception will cause a Data Storage
interrupt.
=1 Load operations (including load-class Cache Management instruc-
tions) are permitted from this page when MSRPR=1.
SR Supervisor State Read Enable (1 bit) See Section 6.2.4.3.
=0 Load operations (including load-class Cache Management instruc-
tions) are not permitted from this page when MSRPR=0 and will cause
a Read Access Control exception. Except as noted in Table 6-7 on
page 131, a Read Access Control exception will cause a Data Storage
interrupt.
=1 Load operations (including load-class Cache Management instruc-
tions) are permitted from this page when MSRPR=0.
Bit(s) Description
W Write-Through Required (1 bit) See Section 6.2.5.1.
=0 The page is not Write-Through Required.
=1 The page is Write-Through Required.
I Caching Inhibited (1 bit) See Section 6.2.5.2.
=0 The page is not Caching Inhibited.
=1 The page is Caching Inhibited.
The Valid (V) bit, Effective Page Number (EPN) field, Translation Space Identifier
(TS) bit, Page Size (SIZE) field, and Translation ID (TID) field of a particular TLB
entry identify the page associated with that TLB entry. Except as noted, all com-
parisons must succeed to validate this entry for subsequent translation and
access control processing. Failure to locate a matching TLB entry based on these
criteria for instruction fetches will result in an Instruction TLB Miss exception
type Instruction TLB Error interrupt. Failure to locate a matching TLB entry
based on these criteria for data storage accesses will result in a Data TLB Miss
exception which may result in a Data TLB Error interrupt. Figure 6-1 on page 127
illustrates the criteria for a virtual address to match a specific TLB entry.
There are two address spaces, one typically associated with interrupt-related stor-
age accesses and one typically associated with non-interrupt-related storage
accesses. There are two bits in the Machine State Register, the Instruction
Address Space bit (IS) and the Data Address Space bit (DS), that control which
address space instruction and data storage accesses, respectively, are performed
in, and a bit in the TLB entry (TS) that specifies which address space that TLB
entry is associated with.
Load, Store, Cache Management, Branch, tlbsx[e], and tlbivax[e] instructions and
next-sequential-instruction fetches produce a 64-bit effective address. The virtual
address space is extended from this 64-bit effective address space by prepending
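a one-bit address space identifier and a process identifier. For instruction fetches,
the address space identifier is provided by MSRIS and the process identifier is
provided by the contents of the Process ID Register. For data storage accesses, the
address space identifier is provided by MSRDS and the process identifier is
provided by the contents of the Process ID Register.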
This virtual address is used to locate the associated entry in the TLB. The address
space identifier, the process identifier, and the effective address of the storage
access are compared to the Translation Address Space bit (TS), the Translation ID
field (TID), and the value in the Effective Page Number field (EPN), respectively, of
each TLB entry.
- the value of the address specifier for the storage access (MSRIS for
instruction fetches, MSRDS for data storage accesses, and implementa-
tion-dependent source for tlbsx[e] and tlbivax[e]) is equal to the value
of the TS bit of the TLB entry, and
- the contents of bits 0:n–1 of the effective address of the storage or TLB
access are equal to the value of bits 0:n-1 of the EPN field of the TLB
entry (where n=64-log2(page size in bytes) and page size is specified by
the value of the SIZE field of the TLB entry). See Table 6-5.
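The match criteria can be sketched in Python as follows. This is a minimal illustration and not part of Book E: the TlbEntry class, its field names, and the matches function are hypothetical, the EPN is assumed to be held left-aligned in a 64-bit value, and a TID of 0 is assumed to mark a shared page that matches any Process ID (as the shared-page test in Figure 6-1 suggests).

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1

@dataclass
class TlbEntry:       # hypothetical model of the page identification fields
    v: int            # Valid bit
    ts: int           # Translation Address Space bit
    tid: int          # Translation ID (0 is assumed to mean a shared page)
    epn: int          # Effective Page Number, left-aligned in 64 bits (assumption)
    page_size: int    # page size in bytes, a power of two

def page_number_bits(page_size):
    """n = 64 - log2(page size in bytes)."""
    return 64 - (page_size.bit_length() - 1)

def matches(entry, addr_space, pid, ea):
    """True if the virtual address (addr_space, pid, ea) matches the entry."""
    shift = 64 - page_number_bits(entry.page_size)   # number of page-offset bits
    return (entry.v == 1
            and entry.ts == addr_space
            and (entry.tid == 0 or entry.tid == pid)
            and (entry.epn >> shift) == ((ea & MASK64) >> shift))
```

For a 4 KB page, n is 52, so only the upper 52 bits of the effective address take part in the comparison.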
A TLB Miss exception occurs if there is no valid entry in the TLB for the page
specified by the virtual address (Instruction or Data TLB Error interrupt). Although
it is possible to place multiple entries that match a specific virtual address into
the TLB (assuming a set-associative or fully-associative organization), doing so is
a programming error and the results are undefined.
[Figure 6-1: virtual address to TLB entry match. Recoverable labels:
  AS          MSRIS for instruction fetches, MSRDS for data storage accesses, or
              implementation-dependent for tlbsx[e] and tlbivax[e]
  Process ID  contents of the Process ID Register for instruction fetches and data
              storage accesses, or implementation-dependent for tlbsx[e] and tlbivax[e]
  EA          effective address of the storage access
  Tests       TLBentry[i][TID] = 0? (shared page); TLBentry[i][EPN]0:N-1 =? EA0:N-1]
A program references memory by using the effective address computed by the pro-
cessor when it executes a Load, Store, Cache Management, or Branch instruction,
and when it fetches the next instruction. The effective address is translated to a
real address according to the procedures described in this section. The storage
subsystem uses the real address for the access. All storage access effective
addresses are translated to real addresses using the TLB mechanism. See Figure
6-2.
[Figure 6-2: effective-to-real address translation through the multiple-entry TLB,
which supplies RPN0:53]
If the virtual address of the storage access matches a TLB entry in accordance
with the selection criteria specified in Section 6.2.2, the value of the Real Page
Number field (RPN) of the selected TLB entry provides the real page number por-
tion of the real address. Let n=64–log2(page size in bytes) where page size is spec-
ified by the SIZE field of the TLB entry. Bits n:63 of the effective address are
appended to bits 0:n–1 of the 54-bit RPN field of the selected TLB entry to produce
the 64-bit real address (i.e. RA = RPN0:n–1 || EAn:63). The page size is determined
by the value of the SIZE field of the selected TLB entry. See Table 6-6.
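The formation of the real address, RA = RPN0:n–1 || EAn:63, can be illustrated with a short Python sketch. It is an illustration only: the function name real_address is hypothetical, and the RPN is assumed to be held left-aligned in a 64-bit value with its unused low-order bits set to 0, as software is required to do.

```python
MASK64 = (1 << 64) - 1

def real_address(rpn, ea, page_size):
    """Concatenate RPN bits 0:n-1 with EA bits n:63, n = 64 - log2(page_size)."""
    offset_bits = page_size.bit_length() - 1     # log2(page size) = 64 - n
    offset_mask = (1 << offset_bits) - 1
    page_part = rpn & MASK64 & ~offset_mask      # RPN bits 0:n-1
    return page_part | (ea & offset_mask)        # append EA bits n:63
```

With a 4 KB page, the low-order 12 bits of the effective address pass through unchanged and the remaining bits come from the RPN field.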
The rest of the selected TLB entry provides the access control bits (UX, SX, UW,
SW, UR, SR), and storage attributes (U0, U1, U2, U3, W, I, M, G, E) for the storage
access. The access control bits and storage attribute bits specify whether or not
the access is allowed and how the access is to be performed. See Sections 6.2.4
and 6.2.5.
The Real Page Number field (RPN) of the matching TLB entry provides the transla-
tion for the effective address of the storage access. Based on the setting of the
SIZE field of the matching TLB entry, the RPN field replaces the corresponding
most-significant N bits of the effective address (where N = 64 – log2(page size)), as
shown in Figure 6-6, to produce the 64-bit real address that is to be presented to
main storage to perform the storage access.
After a matching TLB entry has been identified, Book E provides an access control
mechanism for selectively granting shared access, granting execute access, grant-
ing read access, granting write access, and prohibiting access to areas of storage
based on a number of criteria. Figure 6-3 illustrates the access control process
and is described in detail in Sections 6.2.4.1, 6.2.4.2, 6.2.4.3, 6.2.4.4, and
6.2.4.5.
An Execute, Read, or Write Access Control exception occurs if the appropriate TLB
entry is found but the access is not allowed by the access control mechanism
(Instruction or Data Storage interrupt). See Section 7.6 for additional information
about these and other interrupt types. In certain cases, Execute, Read, and Write
Access Control exceptions may result in the restart of (re-execution of at least part
of) a Load or Store instruction.
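[Figure 6-3: access control process, selecting among the access control bits of the
matching TLB entry (labels recovered: TLBentry[SX], TLBentry[SR], TLBentry[SW])]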
Instructions may be fetched and executed from a page in storage while in user
state (MSRPR=1) if the UX access control bit for that page is equal to 1. If the UX
access control bit is equal to 0, then instructions from that page will not be
fetched, and will not be placed into any cache as the result of a fetch request to
that page while in user state.
Instructions may be fetched and executed from a page in storage while in supervi-
sor state (MSRPR=0) if the SX access control bit for that page is equal to 1. If the
SX access control bit is equal to 0, then instructions from that page will not be
fetched, and will not be placed into any cache as the result of a fetch request to
that page while in supervisor state.
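The selection of an access control bit by access type and privilege state can be sketched in Python. This is a hedged illustration rather than a definition: the function names and the dictionary representation of the UX, SX, UW, SW, UR, and SR bits are assumptions made for the example.

```python
def access_permitted(entry_bits, access, msr_pr):
    """entry_bits maps 'UX','SX','UW','SW','UR','SR' to 0 or 1.

    access is 'execute', 'read', or 'write'; msr_pr is 1 in user state and
    0 in supervisor state.
    """
    prefix = "U" if msr_pr == 1 else "S"
    suffix = {"execute": "X", "read": "R", "write": "W"}[access]
    return entry_bits[prefix + suffix] == 1

def access_control_exception(entry_bits, access, msr_pr):
    """Name of the exception raised by a denied access, or None if permitted."""
    if access_permitted(entry_bits, access, msr_pr):
        return None
    return {"execute": "Execute Access Control",
            "read": "Read Access Control",
            "write": "Write Access Control"}[access]
```

A page with SX=1 and UX=0, for example, can be executed from supervisor state but not from user state.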
dcba[e] instructions are treated as Stores since they can change data. As such,
they can cause Write Access Control exceptions. However, such exceptions will
not result in a Data Storage interrupt.
icbi[e] instructions are treated as Loads with respect to protection. As such, they
can cause Read Access Control exception type Data Storage interrupts.
dcbt[e], dcbtst[e], and icbt[e] instructions are treated as Loads with respect to
protection. As such, they can cause Read Access Control exceptions. However,
such exceptions will not result in a Data Storage interrupt.
dcbf[e] and dcbst[e] instructions are treated as Loads with respect to protection.
Flushing or storing a line from the cache is not considered a Store since the store
has already been done to update the cache and the dcbf[e] or dcbst[e] instruction
is only updating the copy in main storage. As a Load, they can cause Read Access
Control exception type Data Storage interrupts.
Instruction   Read Protection         Write Protection
              Violation Exception?    Violation Exception?
dcba[e]       No                      Yes (2)
dcbf[e]       Yes                     No
dcbi[e]       No                      Yes
dcbst[e]      Yes                     No
dcbt[e]       Yes (1)                 No
dcbtst[e]     Yes (1)                 No
dcbz[e]       No                      Yes
icbi[e]       Yes                     No
icbt[e]       Yes (1)                 No
1. dcbt[e], dcbtst[e], and icbt[e] may cause a Read Access Control exception,
   but the exception does not result in a Data Storage interrupt.
2. dcba[e] may cause a Write Access Control exception, but the exception does
   not result in a Data Storage interrupt.
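The table and its two footnotes can be expressed as a lookup, sketched here in Python. The names CACHE_OP_PROTECTION and violation_outcome are hypothetical, the [e] forms are assumed to behave like their base mnemonics, and the sketch does not model privilege (dcbi[e], for instance) or the exceptions noted in Table 6-7.

```python
# Base mnemonic -> (treated as, violation results in a Data Storage interrupt)
CACHE_OP_PROTECTION = {
    "dcba":   ("store", False),   # footnote 2: exception only, no interrupt
    "dcbf":   ("load",  True),
    "dcbi":   ("store", True),
    "dcbst":  ("load",  True),
    "dcbt":   ("load",  False),   # footnote 1: exception only, no interrupt
    "dcbtst": ("load",  False),   # footnote 1
    "dcbz":   ("store", True),
    "icbi":   ("load",  True),
    "icbt":   ("load",  False),   # footnote 1
}

def violation_outcome(mnemonic, entry_bits, msr_pr):
    """Return (exception, interrupt) for a cache management instruction.

    entry_bits maps 'UW','SW','UR','SR' to 0 or 1; either element of the
    result is None when it does not occur.
    """
    treated_as, takes_interrupt = CACHE_OP_PROTECTION[mnemonic]
    prefix = "U" if msr_pr == 1 else "S"
    bit = prefix + ("R" if treated_as == "load" else "W")
    if entry_bits[bit] == 1:
        return (None, None)
    kind = "Read" if treated_as == "load" else "Write"
    return (kind + " Access Control", "Data Storage" if takes_interrupt else None)
```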
Some operating systems may provide a means to allow programs to specify the
storage attributes described in this section. Because the support provided for
these attributes by the operating system may vary between systems, the details of
the specific system being used must be known before these attributes can be
used.
Storage attributes are associated with units of storage called pages. Each storage
access is performed according to the storage control attributes of the specified
storage location, as described below. The storage control attributes are the follow-
ing.
The W, I, M, G, E, U0, U1, U2, and U3 bits in the TLB entry control the way in
which the processor performs storage accesses in the page associated with the
TLB entry.
Programming Note
The Write-Through Required and Caching Inhibited attributes are mutually exclusive
because, as described below, the Write-Through Required attribute permits the storage
location to be in the data cache while the Caching Inhibited attribute does not.
In the remainder of this chapter, ‘Load instruction’ includes the Cache Manage-
ment and other instructions that are stated in the instruction descriptions to be
‘treated as a Load’, and similarly for ‘Store instruction’.
For storage that is not Memory Coherence Required, software must explicitly
manage memory coherence to the extent required by program correctness. The
operations required to do this may be system-dependent.
Because the Memory Coherence Required attribute for a given storage location is
of little use unless all processors that access the location do so coherently, in
statements about Memory Coherence Required storage elsewhere in this docu-
ment it is generally assumed that the storage has the Memory Coherence
Required attribute for all processors that access it.
In most systems the default is that all storage is Memory Coherence Required. For some
applications in some systems, software management of coherence may yield better per-
formance. In such cases, a program can request that a given unit of storage not be
Memory Coherence Required, and can manage the coherence of that storage by using
the msync instruction, the Cache Management instructions, and services provided by
the operating system.
Engineering Note
Memory coherence can be implemented, for example, by an ownership protocol that
allows at most one processor at a time to store to a given location in Memory Coherence
Required storage.
The ability to disable Memory Coherence Required is provided (see Table 6-4 on
page 124) to allow improved performance in systems in which accesses to storage
kept consistent by hardware are slower than accesses to storage not kept consis-
tent by hardware, and in which software is able to enforce the required consis-
tency. When the Storage attribute is off (M=0), the hardware need not enforce data
coherence for storage accesses initiated by the processor. When the Storage
attribute is on (M=1), the hardware must enforce data coherence for storage
accesses initiated by the processor.
When an access is performed for which data coherence is required, the processor
performing the access must inform the coherence mechanism that the access
requires memory coherence. Other processors affected by the access must
respond to the coherence mechanism. However, since the mode control bits have
no direct relation to data or instructions in the cache, processors responding to
the coherence request are able to respond without knowledge of the state of this
bit. Because instruction storage need not be consistent with data storage, it is
permissible for an implementation to ignore the M bit for instruction fetches.
System Note
Entities other than processors can request that their memory transactions obey mem-
ory coherence.
Engineering Note
Treating instruction fetches as non-coherent can result in better performance in an
implementation in which a coherent storage request has greater latency or overhead
than a non-coherent storage request.
A data access to a Guarded storage location is performed only if either the access
is caused by an instruction that is known to be required by the sequential execu-
tion model, or the access is a load and the storage location is already in a cache. If
the storage is also Caching Inhibited, only the storage location specified by the
instruction is accessed; otherwise any storage location in the cache block contain-
ing the specified storage location may be accessed.
Instruction fetch is not affected by Guarded storage. While Book E does not pre-
vent instructions from being fetched out-of-order from Guarded storage, system
software should prevent all instruction fetching from Guarded storage by making
Guarded pages ‘no-execute’ (see Table 6-4 on page 124). Then, if the effective
address of the current instruction is in such storage, an Execute Access Control
type Instruction Storage interrupt is invoked.
Programming Note
In some implementations, instructions may be executed before they are known to be
required by the sequential execution model. Because the results of instructions exe-
cuted in this manner are discarded if it is later determined that those instructions would
not have been executed in the sequential execution model, this behavior does not affect
most programs.
This behavior does affect programs that access storage locations that are not ‘well-
behaved’ (e.g., a storage location that represents a control register on an I/O device that,
when accessed, causes the device to perform an operation). To avoid unintended results,
programs that access such storage locations should request that the storage be
Guarded, and should prevent such storage locations from being in a cache (e.g., by
requesting that the storage also be Caching Inhibited).
Architecture Note
The rules for accessing Guarded storage when an Imprecise mode Floating-Point
Enabled exception is pending should be revisited when Book E is clarified with respect
to those modes. For example, it may be acceptable to require software synchronization
between any instruction that could cause a floating-point enabled exception in Impre-
cise mode and a subsequent instruction that accesses Guarded storage. (A Floating-Point
Status and Control Register instruction might provide sufficient synchronization.)
Big Endian
Note that strings are not multiple-byte scalars but are interpreted as a series of
single-byte scalars. Bytes in a string are loaded from storage, using a Load String
Word instruction, starting at the lowest-numbered address, and placed into the
target register or registers starting at the left-most byte of the least-significant
word. Bytes in a string are stored, using a Store String Word instruction, from the
source register starting at the left-most byte of the least-significant word, and
placed into storage, starting at the lowest numbered address.
Little Endian
Alternatively, if the probing shows that the lowest memory address contains the
lowest-order byte of the multiple-byte scalar, the next higher sequential address
the next most significant byte, and so on, then the multiple-byte object is stored
in Little Endian form.
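The probing can be sketched in Python by examining the byte at the lowest address of a stored multiple-byte scalar. This is an illustration under stated assumptions: the function name probe_byte_order is hypothetical, memory is modeled as a bytes object whose index 0 is the lowest address, and the probe value is assumed to have distinct highest-order and lowest-order bytes.

```python
def probe_byte_order(stored, value):
    """Classify how a multiple-byte scalar was stored.

    stored is the memory image (index 0 is the lowest address) and value is
    the scalar that was stored there.
    """
    lowest_order = value & 0xFF
    highest_order = (value >> (8 * (len(stored) - 1))) & 0xFF
    if stored[0] == lowest_order and stored[0] != highest_order:
        return "little"    # lowest address holds the lowest-order byte
    if stored[0] == highest_order and stored[0] != lowest_order:
        return "big"       # lowest address holds the highest-order byte
    return "unknown"       # probe value cannot distinguish the two forms
```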
6.2.5.6 User-Definable
User-definable storage attributes control user-definable and implementation-
dependent behavior of the storage system. These bits are both implementation-
dependent and system-dependent in their effect. They may be used in any combi-
nation and also in combination with the other storage attribute bits. See User’s
Manual for the implementation.
Engineering Note
Some implementations may only support the Endianness storage attribute in a static
manner and may require software assistance as well as a context-synchronizing event
between successive accesses to little-endian and big-endian storage (e.g. support the
Endianness storage attribute as a static mode). These implementations may rely on the
Byte Ordering exception type Data Storage or Instruction Storage interrupt to switch
between ‘little-endian mode’ and ‘big-endian mode’.
Engineering Note
If an implementation uses a ‘MESI’ coherence protocol, a store addressed to a write-
through page may find the addressed cache block in the cache and modified. If so, the
store should update the location in both the cache block and main storage (the normal
operation of a store to Write-Through Required storage). It is acceptable for the imple-
mentation to write the block back to main storage, in which case it can change the state
to ‘unmodified.’ It is also acceptable for the implementation to leave the state of the
cache block ‘modified’ after updating the location in cache and main storage.
Loads, Stores, dcbz[e] instructions, and instruction fetches to the same storage
location using two effective addresses for which the Caching Inhibited storage
attribute (I bit) differs must meet the requirement that a copy of the target loca-
tion of an access to Caching Inhibited storage not be in the cache. Violation of this
requirement is considered a programming error; software must ensure that the
location has not previously been brought into the cache or, if it has, that it has
been flushed from the cache. If the programming error occurs, the result of the
access is boundedly undefined. It is not considered a programming error if the
target location of any other cache management instruction to Caching Inhibited
storage is in the cache.
Accesses to the same storage location using two effective addresses for which the
Guarded storage attribute (G bit) differs are always permitted.
Except for instruction fetches, accesses to the same storage location using two
effective addresses for which the endian storage attribute (E bit) differs are always
permitted. Instruction storage locations must be flushed before the endian stor-
age attribute can be changed for those addresses.
The specification of mismatched user storage attributes (U0 through U3) is imple-
mentation-dependent. See the User’s Manual for the implementation.
Accesses to the same storage location using two effective addresses for which the
memory coherence Storage attribute (M bit) differs may require explicit software
synchronization before accessing the location with M=1 if the location has previ-
ously been accessed with M=0. Any such requirement is system-dependent. For
example, in some ‘snooping bus’ based systems no software synchronization may
be required. In some ‘directory based’ systems, software may be required to exe-
cute dcbf[e] instructions on each processor to flush all storage locations accessed
with M=0 before accessing those locations with M=1.
Book E does not imply any format for the page tables or the page table entries.
Software has significant flexibility in implementing a custom replacement strat-
egy. For example, software may choose to lock TLB entries that correspond to fre-
quently used storage, so that those entries are never cast out of the TLB and TLB
Miss exceptions to those pages never occur. At a minimum, software must main-
tain an entry or entries for the Instruction and Data TLB Error interrupt handlers.
Programming Note
This note suggests one example for managing reference and change recording in a
Book E system.
When performing physical page management, it is useful to know whether a given physi-
cal page has been referenced or altered. Note that this may be more involved than
whether a given TLB entry has been used to reference or alter memory, since multiple
TLB entries may translate to the same physical page. If it is necessary to replace the
contents of some physical page with other contents, a page which has been referenced
(accessed for any purpose) is more likely to be maintained than a page which has never
been referenced. If the contents of a given physical page are to be replaced, then the
contents of that page must be written to the backing store before replacement, if any-
thing in that page has been changed. Software must maintain records to control this
process.
Similarly, when performing TLB management, it is useful to know whether a given TLB
entry has been referenced. When making a decision about which entry to cast out of the
TLB, an entry which has been referenced is more likely to be maintained in the TLB than
an entry which has never been referenced.
Execute, Read and Write Access Control exceptions may be used to allow software to
maintain reference information for a TLB entry and for its associated physical page. The
entry is built, with its UX, SX, UR, SR, UW, and SW bits off, and the index and effective
page number of the entry retained by software. The first attempt of application code to
use the page will cause an Access Control exception (because the entry is marked ‘No
Execute’, ‘No Read’, and ‘No Write’). The Instruction or Data Storage interrupt handler
records the reference to the TLB entry and to the associated physical page in a software
table, and then turns on the appropriate access control bit. An initial read from the page
could be handled by only turning on the appropriate UR or SR access control bits, leav-
ing the page ‘read-only’. Subsequent execute, read, or write accesses to the page via this
TLB entry will proceed normally.
Write Access Control exceptions may be used to allow software to maintain change infor-
mation for a physical page. For the example just given for reference recording, the first
write access to the page via the TLB entry will create a Write Access Control exception
type Data Storage interrupt. The Data Storage interrupt handler records the change sta-
tus to the physical page in a software table, and then turns on the appropriate UW and
SW bits. All subsequent accesses to the page via this TLB entry will proceed normally.
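The reference and change recording scheme in this note can be modeled with a few lines of Python. This is only a sketch of one possible policy: the PageRecord class and record_and_enable function are hypothetical, the TLB entry is modeled as a dictionary of access control bits, and the handler is assumed to enable only the bit for the faulting access in the current privilege state.

```python
from dataclasses import dataclass

@dataclass
class PageRecord:             # hypothetical software table entry for a physical page
    referenced: bool = False
    changed: bool = False

def record_and_enable(entry_bits, record, access, msr_pr):
    """Model of the Storage interrupt handler's bookkeeping.

    Records the reference (and the change, for a write), then turns on the
    access control bit that the faulting access needs.
    """
    record.referenced = True
    if access == "write":
        record.changed = True
    prefix = "U" if msr_pr == 1 else "S"
    suffix = {"execute": "X", "read": "R", "write": "W"}[access]
    entry_bits[prefix + suffix] = 1
```

Starting from an entry with all six bits off, a first user-state read leaves the page read-only with the reference recorded; a later write faults once more and records the change.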
Architecture Note
All processors in a symmetric multiprocessor must be identical with respect to the cache
model, the coherence block size, and the reservation granule sizes.
The Cache Management instructions obey the sequential execution model except
as described in the example on page 141 of managing coherence between the
instruction and data caches.
Engineering Note
An example of the requirements of the sequential execution model with respect to Cache
Management instructions is that a Load instruction that specifies a storage location in
the block specified by a preceding dcbf[e] instruction must be satisfied from main stor-
age (if the location is in storage that is not Memory Coherence Required) or from
coherent storage (if the location is in storage that is Memory Coherence Required), and
not from the copy of the location that existed in the cache when the dcbf[e] instruction
was executed.
Similar requirements apply to cache reload buffers. For example, if a cache reload
request for a given instruction cache block is pending when an icbi[e] instruction is exe-
cuted specifying the same block, the results of the reload request must not be used to
satisfy a subsequent instruction fetch.
If, at any level of the storage hierarchy, a combined cache is implemented such that
locations in that cache lack an indication of whether they were fetched as data or as
instructions, the locations must be treated as if they were fetched as data. E.g., dcbf[e]
must flush and invalidate them, and icbi[e] must not invalidate them. (Permitting icbi[e]
to invalidate a block that was fetched as data would make icbi[e] act as a user-mode
dcbi[e], and thereby create a security and data integrity exposure.)
Programming Note
It is suggested that the operating system provide a service that allows an application
program to obtain the following information.
If the caches are combined, the same value should be given for an instruction cache
attribute and the corresponding data cache attribute.
The instruction cache is not necessarily kept consistent with the data cache or
with main storage. When instructions are modified by processors or by other
mechanisms, software must ensure that the instruction cache is made consistent
with data storage and that the modifications are made visible to the instruction
fetching mechanism. The following instruction sequence can be used to accom-
plish this when the instructions being modified are in storage that is Memory
Coherence Required and one program both modifies the instructions and exe-
cutes them. (Additional synchronization is needed when one program modifies
instructions that another program will execute.) In this sequence, location instr is
assumed to contain instructions that have been modified.
Programming Note
Because the optimal instruction sequence may vary between systems, many operating
systems will provide a system service to perform the function described above.
Engineering Note
Correct operation of the instruction sequence shown above, and of any corresponding
system-dependent sequence, may require that an instruction fetch request not bypass a
writeback of the same storage location caused by the sequence (including a writeback by
another processor).
Programming Note
As stated above, the effective address is translated using translation resources used for
data accesses, even though the block being invalidated was copied into the instruction
cache based on translation resources used for instruction fetches.
While Book E describes logically separate instruction fetch and integer (including
effective address computation) operations, the programming model is that there is
a common translation mechanism. Separate instruction and data TLBs as well as
multi-level TLBs are allowed in Book E at the discretion of the implementation.
7.1 Overview
An interrupt is the action in which the processor saves its old context (Machine
State Register and next instruction address) and begins execution at a pre-deter-
mined interrupt-handler address, with a modified Machine State Register. Excep-
tions are the events that will, if enabled, cause the processor to take an interrupt.
In Book E, exceptions are generated by signals from internal and external periph-
erals, instructions, the internal timer facility, debug events, or error conditions.
All interrupts, except Machine Check, are ordered within the two categories of
non-critical and critical, such that only one interrupt of each category is reported,
and when it is processed (taken) no program state is lost. Since Save/Restore Reg-
ister pairs SRR0/SRR1 and CSRR0/CSRR1 are serially reusable resources used
by all non-critical and critical interrupts respectively, program state may be lost
when an unordered interrupt is taken (see Section 7.8 on page 174).
The contents of Save/Restore Register 0 can be read into GPR(RT) using mfspr
RT,SRR0. The contents of GPR(RS) can be written into Save/Restore Register 0
using mtspr SRR0,RS.
Programming Note
A Machine State Register bit that is reserved may be altered by rfi/rfci.
The contents of Save/Restore Register 1 can be read into bits 32:63 of GPR(RT)
using mfspr RT,SRR1, setting bits 0:31 of GPR(RT) to zero. The contents of bits
32:63 of GPR(RS) can be written into the Save/Restore Register 1 using mtspr
SRR1,RS.
The contents of Critical Save/Restore Register 0 can be read into GPR(RT) using
mfspr RT,CSRR0. The contents of GPR(RS) can be written into Critical Save/
Restore Register 0 using mtspr CSRR0,RS.
Programming Note
A Machine State Register bit that is reserved may be altered by rfi/rfci.
The contents of Critical Save/Restore Register 1 can be read into bits 32:63 of
GPR(RT) using mfspr RT,CSRR1, setting bits 0:31 of GPR(RT) to zero. The contents
of bits 32:63 of GPR(RS) can be written into the Critical Save/Restore Register 1
using mtspr CSRR1,RS.
The Data Exception Address Register (DEAR) is a 64-bit register. Data Exception
Address Register bits are numbered 0 (most-significant bit) to 63 (least-significant
bit). The Data Exception Address Register contains the address that was refer-
enced by a Load, Store or Cache Management instruction that caused an Align-
ment, Data TLB Miss, or Data Storage interrupt.
The contents of Data Exception Address Register can be read into GPR(RT) using
mfspr RT,DEAR. The contents of GPR(RS) can be written into the Data Exception
Address Register using mtspr DEAR,RS.
The Interrupt Vector Prefix Register (IVPR) is a 64-bit register. Interrupt Vector
Prefix Register bits are numbered 0 (most-significant bit) to 63 (least-significant
bit). Bits 48:63 are reserved. Bits 0:47 of the Interrupt Vector Prefix Register
provide the high-order 48 bits of the address of the exception processing routines.
The 16-bit exception vector offsets (provided in Section 7.2.8) are concatenated to
the right of bits 0:47 of the Interrupt Vector Prefix Register to form the 64-bit
address of the exception processing routine.
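The concatenation can be illustrated with a short Python sketch. The function name handler_address is hypothetical, and the register contents are modeled as plain integers in which bit 0 is the most-significant bit of a 64-bit value.

```python
IVPR_PREFIX_MASK = 0xFFFF_FFFF_FFFF_0000   # bits 0:47 of a 64-bit register

def handler_address(ivpr, offset16):
    """Form the 64-bit handler address: IVPR bits 0:47 || 16-bit vector offset."""
    return (ivpr & IVPR_PREFIX_MASK) | (offset16 & 0xFFFF)
```

Because bits 48:63 of the Interrupt Vector Prefix Register are reserved, the sketch masks them off before the offset is appended.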
Bit(s)  Syndrome                                        Associated Interrupt Types
32:35   Allocated
36      PIL   Illegal Instruction exception             Program
37      PPR   Privileged Instruction exception          Program
38      PTR   Trap exception                            Program
39      FP    Floating-point operation                  Alignment, Data Storage,
                                                        Data TLB, Program
40      ST    Store operation                           Alignment, Data Storage,
                                                        Data TLB Error
41      Reserved
42      DLK0  Cache Locking (implementation-dependent)  Data Storage
43      DLK1
44      AP    Auxiliary Processor operation             Alignment, Data Storage,
                                                        Data TLB, Program
45      PUO   Unimplemented Operation exception         Program
46      BO    Byte Ordering exception                   Data Storage, Inst Storage
47      PIE   Imprecise exception                       Program
48:55   Reserved
56:63   Allocated for implementation-dependent use
Programming Note
The information provided by the Exception Syndrome Register is not complete. System
software may also need to identify the type of instruction that caused the interrupt,
examine the TLB entry accessed by a data or instruction storage access, as well as
examining the Exception Syndrome Register to fully determine what exception or excep-
tions caused the interrupt. For example, a Data Storage interrupt may be caused by
both a Protection Violation exception as well as a Byte Ordering exception. System soft-
ware would have to look beyond ESRBO, such as the state of MSRPR in Save/Restore
Register 1 and the page protection bits in the TLB entry accessed by the storage access,
to determine whether or not a Protection Violation also occurred.
The contents of the Exception Syndrome Register can be read into bits 32:63 of
GPR(RT) using mfspr RT,ESR, setting bits 0:31 of GPR(RT) to zero. The contents of
bits 32:63 of GPR(RS) can be written into the Exception Syndrome Register using
mtspr ESR,RS.
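As a rough illustration of how system software might interpret a value read with mfspr RT,ESR, the sketch below maps the bit numbers from the table above to their mnemonics. It assumes the value is held as a Python integer in which bit 63 is the least-significant bit; the helper names `esr_mask` and `decode_esr` are hypothetical.

```python
# Bit numbers follow the Exception Syndrome Register table: bit 32 is the
# most-significant bit of the 32-bit register and bit 63 the least-significant.
ESR_BITS = {
    36: "PIL", 37: "PPR", 38: "PTR", 39: "FP", 40: "ST",
    42: "DLK0", 43: "DLK1", 44: "AP", 45: "PUO", 46: "BO", 47: "PIE",
}


def esr_mask(bit: int) -> int:
    """Mask for architecture bit number `bit` (32..63) of a value in GPR bits 32:63."""
    return 1 << (63 - bit)


def decode_esr(esr: int) -> list:
    """Return the mnemonics of all defined syndrome bits set in `esr`, in bit order."""
    return [name for bit, name in sorted(ESR_BITS.items()) if esr & esr_mask(bit)]
```

As the Programming Note above points out, the decoded list is not the whole story: software may still need to examine the instruction and the TLB entry to determine which exceptions occurred.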
The Interrupt Vector Offset Registers (IVORs) are 32-bit registers. Interrupt Vector
Offset Register bits are numbered 32 (most-significant bit) to 63 (least-significant
bit). Bits 32:47 and bits 60:63 are reserved. An Interrupt Vector Offset Register
provides the quadword index from the base address provided by the IVPR (see
Section 7.2.6) for its respective interrupt type. Interrupt Vector Offset Registers 0
through 15 are provided for the defined interrupt types. SPR numbers corre-
sponding to Interrupt Vector Offset Registers 16 through 31 are reserved. SPR
numbers corresponding to Interrupt Vector Offset Registers 32 through 63 are
allocated for implementation-dependent use. Table 7-2 provides the assignments
of specific Interrupt Vector Offset Registers to specific interrupt types.
Bits 48:59 of the contents of IVORi can be read into bits 48:59 of GPR(RT) using
mfspr RT,IVORi, setting bits 0:47 and bits 60:63 of GPR(RT) to zero. Bits 48:59 of
the contents of GPR(RS) can be written into bits 48:59 of IVORi using mtspr
IVORi,RS.
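The way the two registers combine into a vector address can be sketched as follows. This is a minimal model, assuming 64-bit values held in Python integers and treating the reserved bits as zero; the function name `interrupt_vector` is hypothetical.

```python
IVPR_PREFIX_MASK = 0xFFFF_FFFF_FFFF_0000  # IVPR bits 0:47
IVOR_OFFSET_MASK = 0x0000_0000_0000_FFF0  # IVORi bits 48:59 (bits 60:63 reserved)


def interrupt_vector(ivpr: int, ivor: int) -> int:
    """Concatenate IVPR bits 0:47 with the 16-bit offset taken from IVORi."""
    prefix = ivpr & IVPR_PREFIX_MASK
    offset = ivor & IVOR_OFFSET_MASK  # low four bits are zero: quadword aligned
    return prefix | offset
```

Because bits 60:63 of the offset are always zero, every vector computed this way is aligned on a 16-byte (quadword) boundary.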
There are two kinds of exceptions, those caused directly by the execution of an
instruction and those caused by an asynchronous event. In either case, the
exception may cause one of several types of interrupts to be invoked.
• the execution of a Trap instruction whose trap condition is met (Trap type
Program interrupt)
The invocation of an interrupt is precise, except that if one of the imprecise modes
for invoking the Floating-point Enabled Exception type Program interrupt is in
effect then the invocation of the Floating-point Enabled Exception type Program
interrupt may be imprecise. When the interrupt is invoked imprecisely, the
excepting instruction does not appear to complete before the next instruction
starts (because one of the effects of the excepting instruction, namely the invoca-
tion of the interrupt, has not yet occurred).
All interrupts, except for Machine Check, can be categorized according to two
independent characteristics of the interrupt:
• Asynchronous/Synchronous
• Critical/Non-critical
Synchronous interrupts are those that are caused directly by the execution (or
attempted execution) of instructions, and are further divided into two classes, pre-
cise and imprecise.
Synchronous, precise interrupts are those that precisely indicate the address of
the instruction causing the exception that generated the interrupt; or, for certain
synchronous, precise interrupt types, the address of the immediately following
instruction.
Synchronous, imprecise interrupts are those that may indicate the address of the
instruction causing the exception that generated the interrupt, or some instruc-
tion after the instruction causing the exception.
• The instruction causing the exception may appear not to have begun execu-
tion (except for causing the exception), may have been partially executed, or
may have completed, depending on the interrupt type. See Section 7.7 on
page 173.
Machine Check interrupts are a special case. They are typically caused by some
kind of hardware or storage subsystem failure, or by an attempt to access an
invalid address. A Machine Check may be caused indirectly by the execution of an
instruction, but not be recognized and/or reported until long after the processor
has executed past the instruction that caused the Machine Check. As such,
Machine Check interrupts cannot properly be thought of as synchronous or asyn-
chronous, nor as precise or imprecise. They are handled as critical class inter-
rupts however. In the case of Machine Check, the following general rules apply:
1. No instruction after the one whose address is reported to the Machine Check
interrupt handler in Critical Save/Restore Register 0 has begun execution.
Associated with each kind of interrupt is an interrupt vector, that is, the address of
the initial instruction that is executed when the corresponding interrupt occurs.
Interrupt processing consists of saving a small part of the processor’s state in cer-
tain registers, identifying the cause of the interrupt in another register, and con-
tinuing execution at the corresponding interrupt vector location. When an
exception exists that will cause an interrupt to be generated and it has been deter-
mined that the interrupt can be taken, the following actions are performed, in
order:
• Other defined Machine State Register bits are left unchanged by all
interrupts.
See Section 2.1.1 on page 39 for more detail on the definition of the Machine
State Register.
5. Instruction fetching and execution resume, using the new Machine State
Register value, at a location specific to the interrupt type. The location is
formed by concatenating bits 0:47 of IVPR, bits 48:59 of IVORi, and 0b0000,
where IVPR is the Interrupt Vector Prefix Register and IVORi is the Interrupt
Vector Offset Register for that interrupt type (see Table 7-2 on page 147). The
contents of the Interrupt Vector Prefix Register and Interrupt Vector Offset
Registers are indeterminate upon reset, and must be initialized by system
software using the mtspr instruction.
Interrupts do not clear reservations obtained with Load and Reserve instructions.
The operating system should do so at appropriate points, such as at process
switch.
Programming Note
In general, at process switch, due to possible process interlocks and possible data avail-
ability requirements, the operating system needs to consider executing the following.
• msync, to ensure that all storage operations of an interrupted process are complete
with respect to other processors before that process begins executing on another
processor.
• isync, rfi or rfci, to ensure that the instructions in the “new” process execute in the
“new” context.
Table 7-3 provides a summary of each interrupt type, the various exception types
that may cause that interrupt type, the classification of the interrupt, which
Exception Syndrome Register bits can be set, if any, which Machine State Register
bits can mask the interrupt type and which Interrupt Vector Offset Register is
used to specify that interrupt type’s vector address.
[Table 7-3 column headings: IVOR; Interrupt Type; Exception Type; Synchronous, Imprecise; Critical; ESR (See Note 5); Page]
2. Machine Check interrupts are a special case and are classified as neither
asynchronous nor synchronous. See Section 7.4.4 on page 151.
3. The Instruction Complete and Branch Taken debug events are only defined for
MSRDE=1 when in Internal Debug Mode (DBCR0IDM=1). In other words, when
in Internal Debug Mode with MSRDE=0, then Instruction Complete and
Branch Taken debug events cannot occur, and no Debug Status Register sta-
tus bits are set and no subsequent imprecise Debug interrupt will occur (see
Section 9.3 on page 201).
Legend:
[xxx] means ESRxxx could be set
[xxx,yyy] means either ESRxxx or ESRyyy may be set, but never both
(xxx,yyy) means either ESRxxx or ESRyyy will be set, but never both
{xxx,yyy} means either ESRxxx or ESRyyy will be set, or possibly both
xxx means ESRxxx is set
9. Software must examine the instruction and the subject TLB entry to deter-
mine the exact cause of the interrupt.
A Critical Input interrupt occurs when no higher priority exception exists (see
Section 7.9 on page 178), a Critical Input exception is presented to the interrupt
mechanism, and MSRCE=1. While the specific definition of a Critical Input excep-
tion is implementation-dependent, it would typically be caused by the activation
of an asynchronous signal that is part of the system. Also, implementations may
provide an alternative means (in addition to MSRCE) for masking the Critical Input
interrupt. See the User’s Manual.
Programming Note
Software is responsible for taking any action(s) that are required by the implementation
in order to clear any Critical Input exception status prior to re-enabling MSRCE in order
to avoid another, redundant Critical Input interrupt.
A Machine Check interrupt occurs when no higher priority exception exists (see
Section 7.9 on page 178), a Machine Check exception is presented to the interrupt
mechanism, and MSRME=1. The specific cause or causes of Machine Check excep-
tions are implementation-dependent, as are the details of the actions taken on a
Machine Check interrupt. See the User’s Manual.
Programming Note
If a Machine Check interrupt is caused by an error in the storage subsystem, the
storage subsystem may return incorrect data, which may be placed into registers
and/or on-chip caches.
Programming Note
On implementations in which a Machine Check interrupt can be caused by referring to an
invalid real address, executing a dcbz[e] or dcba[e] instruction can cause a delayed
Machine Check interrupt by establishing in the data cache a block that is associated
with an invalid real address. See Section 6.3.2 on page 139. A Machine Check interrupt
can eventually occur if and when a subsequent attempt is made to write that block to
main storage, for example as the result of executing an instruction that causes a cache
miss for which the block is the target for replacement or as the result of executing a
dcbst[e] or dcbf[e] instruction.
A Data Storage interrupt may occur when no higher priority exception exists (see
Section 7.9 on page 178) and a Data Storage exception is presented to the inter-
rupt mechanism. A Data Storage exception is caused when any of the following
exceptions arises during execution of an instruction:
Architecture Note
The Byte Ordering exception is provided to assist implementations that cannot sup-
port dynamically switching byte ordering between consecutive storage accesses,
cannot support the byte order for a class of storage accesses, or cannot support
unaligned storage accesses using a specific byte order.
Programming Note
The icbi[e] and icbt[e] instructions are treated as Loads from the addressed byte with
respect to address translation and protection. These Instruction Cache Management
instructions use MSRDS, not MSRIS, to determine translation for their operands.
Instruction Storage exceptions and Instruction TLB Miss exceptions are associated with
the ‘fetching’ of instructions not with the ‘execution’ of instructions. Data Storage excep-
tions and Data TLB Miss exceptions are associated with the ‘execution’ of Instruction
Cache Management instructions.
When a Data Storage interrupt occurs, the processor suppresses the execution of
the instruction causing the Data Storage exception.
Save/Restore Register 0
Set to the effective address of the instruction causing the Data Storage
interrupt.
Save/Restore Register 1
Set to the contents of the Machine State Register at the time of the
interrupt.
Architecture Note
This exception is provided to assist implementations that cannot support dynami-
cally switching byte ordering between consecutive storage accesses, cannot support
the byte order for a class of storage accesses, or cannot support unaligned storage
accesses using a specific byte order.
When an Instruction Storage interrupt occurs, the processor suppresses the exe-
cution of the instruction causing the Instruction Storage exception.
Save/Restore Register 0
Set to the effective address of the instruction causing the Instruction Stor-
age interrupt.
Save/Restore Register 1
Set to the contents of the Machine State Register at the time of the
interrupt.
Programming Note
Protection Violation and Byte Ordering exceptions are not mutually exclusive.
Even if ESRBO is set, system software must also examine the TLB entry
accessed by the instruction fetch to determine whether or not a Protection Viola-
tion may have also occurred.
An External Input interrupt occurs when no higher priority exception exists (see
Section 7.9 on page 178), an External Input exception is presented to the inter-
rupt mechanism, and MSREE=1. While the specific definition of an External Input
exception is implementation-dependent, it would typically be caused by the acti-
vation of an asynchronous signal that is part of the processing system. Also,
implementations may provide an alternative means (in addition to MSREE) for
masking the External Input interrupt. See the User’s Manual.
Save/Restore Register 0
Set to the effective address of the next instruction to be executed.
Save/Restore Register 1
Set to the contents of the Machine State Register at the time of the
interrupt.
Programming Note
Software is responsible for taking whatever action(s) are required by the implementation
in order to clear any External Input exception status prior to re-enabling MSREE in
order to avoid another, redundant External Input interrupt.
For lmw and stmw with an operand that is not word-aligned, and for Load and
Reserve and Store Conditional instructions with an operand that is not aligned, an
implementation may yield boundedly undefined results instead of causing an
Alignment interrupt. A Store Conditional to Write Through Required storage may
either cause a Data Storage interrupt, cause an Alignment interrupt, or correctly
execute the instruction. For all other cases listed above, an implementation may
execute the instruction correctly instead of causing an Alignment interrupt. (For
dcbz[e], ‘correct’ execution means setting each byte of the block in main storage to
0x00.)
Programming Note
The architecture does not support the use of an unaligned effective address by Load and
Reserve and Store Conditional instructions. If an Alignment interrupt occurs because
one of these instructions specifies an unaligned effective address, the Alignment inter-
rupt handler must not attempt to emulate the instruction, but instead should treat the
instruction as a programming error.
Save/Restore Register 0
Set to the effective address of the instruction causing the Alignment
interrupt.
Save/Restore Register 1
Set to the contents of the Machine State Register at the time of the
interrupt.
– a reserved-illegal instruction
– a privileged instruction
Trap exception
A Trap exception occurs when any of the conditions specified in a Trap
instruction are met and the exception is not also enabled as a Debug
interrupt. If enabled as a Debug interrupt (i.e. DBCR0TRAP=1, DBCR0IDM=1,
Save/Restore Register 0
For all Program interrupts except an Enabled exception when in one of
the imprecise modes (see Section 2.1.1 on page 39) or when a disabled
exception is subsequently enabled, set to the effective address of the
instruction that caused the Program interrupt.
Save/Restore Register 1
Set to the contents of the Machine State Register at the time of the
interrupt.
Engineering Note
Supporting the Imprecise Recoverable Mode Floating-Point Enabled exception type Pro-
gram interrupt as a precise interrupt may be convenient for some implementations.
However, if the Imprecise Recoverable Mode Floating-Point Enabled exception type Pro-
gram interrupt is implemented as an imprecise interrupt, the hardware must provide, at
the minimum, the address at which to resume the interrupted process (this is given in
Save/Restore Register 0), the excepting instruction’s opcode, extended opcode, and
record bit, the source values or registers, and the target register. This information can
be provided directly in registers or by means of a pointer to the excepting instruction.
See the User’s Manual for the implementation.
Save/Restore Register 0
Set to the effective address of the instruction that caused the interrupt.
Save/Restore Register 1
Set to the contents of the Machine State Register at the time of the
interrupt.
A System Call interrupt occurs when no higher priority exception exists (see
Section 7.9 on page 178) and a System Call (sc) instruction is executed.
Save/Restore Register 0
Set to the effective address of the instruction after the sc instruction.
Save/Restore Register 1
Set to the contents of the Machine State Register at the time of the
interrupt.
Save/Restore Register 0
Set to the effective address of the instruction that caused the interrupt.
Save/Restore Register 1
Set to the contents of the Machine State Register at the time of the
interrupt.
Note
MSREE also enables the External Input and Fixed-Interval Timer interrupts.
Save/Restore Register 0
Set to the effective address of the next instruction to be executed.
Save/Restore Register 1
Set to the contents of the Machine State Register at the time of the
interrupt.
Programming Note
Software is responsible for clearing the Decrementer exception status prior to re-
enabling the MSREE bit in order to avoid another redundant Decrementer interrupt. To
clear the Decrementer exception, the interrupt handling routine must clear TSRDIS.
Clearing is done by writing a word to TSR using mtspr with a 1 in any bit position that
is to be cleared and 0 in all other bit positions. The write-data to the TSR is not direct
data, but a mask. A 1 causes the bit to be cleared, and a 0 has no effect.
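The write-one-to-clear behaviour described in this note can be sketched as below. The model assumes a 32-bit TSR held as a Python integer; the bit positions chosen for `TSR_DIS` and `TSR_FIS` are assumptions made for illustration only, and `tsr_after_mtspr` is a hypothetical helper name.

```python
# Assumed bit positions, for illustration only (bit 63 = least-significant bit).
TSR_DIS = 1 << (63 - 36)  # Decrementer interrupt status (position assumed)
TSR_FIS = 1 << (63 - 37)  # Fixed-Interval Timer interrupt status (position assumed)


def tsr_after_mtspr(tsr: int, written: int) -> int:
    """New TSR value after mtspr TSR,RS: each 1 in `written` clears that status bit."""
    return tsr & ~written & 0xFFFF_FFFF


# A handler clearing only the Decrementer status leaves other status bits intact.
pending = TSR_DIS | TSR_FIS
after = tsr_after_mtspr(pending, TSR_DIS)
```

Since the written value acts as a mask, no read-modify-write sequence is needed, and a handler cannot accidentally clear a status bit that became set after it read the register.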
Note
MSREE also enables the External Input and Decrementer interrupts.
Save/Restore Register 0
Set to the effective address of the next instruction to be executed.
Save/Restore Register 1
Set to the contents of the Machine State Register at the time of the
interrupt.
Programming Note
Software is responsible for clearing the Fixed-Interval Timer exception status prior to re-
enabling the MSREE bit in order to avoid another redundant Fixed-Interval Timer inter-
rupt. To clear the Fixed-Interval Timer exception, the interrupt handling routine must
clear TSRFIS. Clearing is done by writing a word to TSR using mtspr with a 1 in any bit
position that is to be cleared and 0 in all other bit positions. The write-data to the TSR is
not direct data, but a mask. A 1 causes the bit to be cleared, and a 0 has no effect.
A Watchdog Timer interrupt occurs when no higher priority exception exists (see
Section 7.9 on page 178), a Watchdog Timer exception exists (TSRWIS=1), and the
interrupt is enabled (i.e. TCRWIE=1 and MSRCE=1). See Section 8.7 on page 196.
Note
MSRCE also enables the Critical Input interrupt.
Programming Note
Software is responsible for clearing the Watchdog Timer exception status prior to re-
enabling the MSRCE bit in order to avoid another redundant Watchdog Timer interrupt.
To clear the Watchdog Timer exception, the interrupt handling routine must clear
TSRWIS. Clearing is done by writing a word to TSR using mtspr with a 1 in any bit position
that is to be cleared and 0 in all other bit positions. The write-data to the TSR is not
direct data, but a mask. A 1 causes the bit to be cleared, and a 0 has no effect.
A Data TLB Error interrupt occurs when no higher priority exception exists (see
Section 7.9 on page 178) and any of the following Data TLB Error exceptions is
presented to the interrupt mechanism.
When a Data TLB Error interrupt occurs, the processor suppresses the execution
of the instruction causing the Data TLB Error interrupt.
Save/Restore Register 0
Set to the effective address of the instruction causing the Data TLB Error
interrupt.
Save/Restore Register 1
Set to the contents of the Machine State Register at the time of the
interrupt.
When an Instruction TLB Error interrupt occurs, the processor suppresses the
execution of the instruction causing the Instruction TLB Miss exception.
Save/Restore Register 0
Set to the effective address of the instruction causing the Instruction TLB
Error interrupt.
Save/Restore Register 1
Set to the contents of the Machine State Register at the time of the
interrupt.
A Debug interrupt occurs when no higher priority exception exists (see Section 7.9
on page 178), a Debug exception exists in the Debug Status Register, and Debug
interrupts are enabled (DBCR0IDM=1 and MSRDE=1). A Debug exception occurs
when a Debug Event causes a corresponding bit in the Debug Status Register to
be set. See Chapter 9 on page 199.
– For Interrupt Taken (IRPT) debug exceptions, set to the interrupt vec-
tor value of the interrupt that caused the Interrupt Taken debug event.
– For Return From Interrupt (RET) debug exceptions, set to the address
of the rfi or rfci instruction that caused the Debug interrupt.
For Debug exceptions that occur while Debug interrupts are disabled
(DBCR0IDM=0 or MSRDE=0), a Debug interrupt will occur at the next syn-
chronizing event if DBCR0IDM and MSRDE are modified such that they are
both 1 and if the Debug exception Status is still set in the Debug Status
Register. When this occurs, Critical Save/Restore Register 0 is set to the
address of the instruction that would have executed next, not with the
address of the instruction that modified the Debug Control Register 0 or
Machine State Register and thus caused the interrupt.
In general, the architecture permits load and store instructions to be partially exe-
cuted, interrupted, and then to be restarted from the beginning upon return from
the interrupt. Unaligned Load and Store instructions, or Load Multiple, Store Multi-
ple, Load String, and Store String instructions may be broken up into multiple,
smaller accesses, and these accesses may be performed in any order. In order to
guarantee that a particular load or store instruction will complete without being
interrupted and restarted, software must mark the storage being referred to as
Guarded, and must use an elementary (non-string or non-multiple) load or store
that is aligned on an operand-sized boundary.
• For ‘with update’ forms of Load or Store, the update register, GPR(RA), will not
have been altered.
On the other hand, the following effects are permissible when certain instructions
are partially executed and then restarted:
• For any Store, some of the bytes at the target storage location may have been
altered (if write access to that page in which bytes were altered is permitted by
the access control mechanism). In addition, for Store Conditional instruc-
tions, CR0 has been set to an undefined value, and it is undefined whether
the reservation has been cleared.
• For any Load, some of the bytes at the addressed storage location may have
been accessed (if read access to that page in which bytes were accessed is
permitted by the access control mechanism).
• For Load Multiple or Load String, some of the registers in the range to be
loaded may have been altered. Including the addressing registers (GPR(RA),
and possibly GPR(RB)) in the range to be loaded is a programming error, and
thus the rules for partial execution do not protect against overwriting of these
registers.
As previously stated, the only load or store instructions that are guaranteed to not
be interrupted after being partially executed are elementary, aligned, guarded
loads and stores. All others may be interrupted after being partially executed. The
following list identifies the specific instruction types for which interruption after
partial execution may occur, as well as the specific interrupt types that could
cause the interruption:
3. mtcrf may also be partially executed due to the occurrence of any of the inter-
rupts listed under item 1 at the time the mtcrf was executing.
Architectural Note
As is the case with this example, when an otherwise synchronous, precise interrupt type
is “delayed” in this fashion via masking, and the interrupt type is later enabled, the
interrupt that is then generated due to the exception event that occurred while the inter-
rupt type was disabled is then considered a synchronous, imprecise class of interrupt.
However, this particular category of synchronous, imprecise interrupt is not generally
discussed in other sections of this document. Rather, the discussion of synchronous,
imprecise interrupts is generally limited to those specific interrupt types that are defined
to be imprecise to begin with, and not those that are delayed versions of otherwise syn-
chronous, precise interrupts.
This first step of clearing MSREE (and MSRCE,DE for critical class interrupts) pre-
vents any subsequent asynchronous interrupts from overwriting the Save/
Restore Registers (SRR0/SRR1 or CSRR0/CSRR1), prior to software being able to
save their contents. Hardware also automatically clears, on any interrupt,
MSRWE,PR,FP,FE0,FE1,IS,DS. The clearing of these bits assists in the avoidance of
subsequent interrupts of certain other types. However, guaranteeing that these
interrupt types do not occur and thus do not overwrite the Save/Restore Registers
(SRR0/SRR1 or CSRR0/CSRR1) also requires the cooperation of system software.
Specifically, system software must avoid the execution of instructions that could
cause (or enable) a subsequent interrupt, if the contents of the Save/Restore Reg-
isters (SRR0/SRR1 or CSRR0/CSRR1) have not yet been saved.
The following list identifies the actions that system software must avoid, prior to
having saved the Save/Restore Registers’ contents:
This prevents any asynchronous interrupts, as well as (in the case of MSRDE)
any Debug interrupts (which include both synchronous and asynchronous
types).
• Execution of System Call (sc) or Trap (tw, twi, td, tdi) instructions
This prevents System Call and Trap exception type Program interrupts.
• Re-enabling of MSRPR
This prevents Alignment interrupts. Included in this category are any string or
multiple instructions, and any unaligned elementary load or store instruc-
tions. See Section 7.6.6 on page 161 for a complete list of instructions that
may cause Alignment interrupts.
Machine Check interrupts are a special case. Machine Checks are critical class
interrupts, but normal critical class interrupts (Critical Input, Watchdog Timer
and Debug) do not automatically disable Machine Checks. Machine Checks are
disabled by clearing the MSRME bit, and only a Machine Check interrupt itself
automatically clears this bit. Thus there is always the risk that a Machine Check
interrupt could occur within a normal, critical interrupt handler, prior to the
Save/Restore Registers’ contents having been saved. In such a case, the interrupt
may not be recoverable.
It is not necessary for hardware or software to avoid critical class interrupts from
within non-critical class interrupt handlers (and hence hardware does not auto-
matically clear MSRCE,ME,DE upon a non-critical interrupt), since the two classes
of interrupts use different pairs of Save/Restore Registers to save the instruction
address and Machine State Register (SRR0/SRR1 for non-critical, and CSRR0/
CSRR1 for critical). The converse, however, is not true. That is, hardware and soft-
ware must cooperate in the avoidance of both critical and non-critical class inter-
rupts from within critical class interrupt handlers, even though the two classes of
interrupts use different Save/Restore Register pairs. This is because the critical
class interrupt may have occurred from within a non-critical handler, prior to the
non-critical handler having saved the non-critical pair of Save/Restore Registers.
Therefore, within the critical class interrupt handler, both pairs of Save/Restore
Registers may contain data that is necessary to the system software.
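The asymmetry between the two classes can be illustrated with a toy model of the two Save/Restore Register pairs. It is only a sketch: the class and method names are hypothetical, the addresses and Machine State Register values are arbitrary, and masking is not modelled at all.

```python
class SaveRestoreModel:
    """Tracks only which register pair each interrupt class overwrites."""

    def __init__(self):
        self.srr0 = self.srr1 = None    # non-critical pair
        self.csrr0 = self.csrr1 = None  # critical pair

    def take_interrupt(self, return_address, msr, critical):
        if critical:
            self.csrr0, self.csrr1 = return_address, msr
        else:
            self.srr0, self.srr1 = return_address, msr


m = SaveRestoreModel()
m.take_interrupt(0x1000, 0x8000, critical=False)  # non-critical interrupt taken
m.take_interrupt(0x2000, 0x0000, critical=True)   # critical interrupt inside its handler
srr0_preserved = (m.srr0 == 0x1000)               # the non-critical pair is untouched

m.take_interrupt(0x2004, 0x0000, critical=False)  # non-critical inside the critical handler
srr0_lost = (m.srr0 != 0x1000)                    # the first handler's return state is gone
```

The last step shows the case the text warns about: a non-critical interrupt taken inside the critical handler overwrites SRR0/SRR1 before the interrupted non-critical handler had a chance to save them.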
The following is a prioritized listing of the various enabled interrupt types for
which exceptions might exist simultaneously:
Data Storage
Instruction Storage
Alignment
Program
Floating-Point Unit Unavailable
Auxiliary Processor Unavailable
System Call
Data TLB Error
Instruction TLB Error
Only one of the above types of synchronous interrupts may have an existing
exception generating it at any given time. This is guaranteed by the exception
priority mechanism (see Section 7.9 on page 178) and the requirements of the
Sequential Execution Model.
2. Machine Check
3. Debug
4. Critical Input
5. Watchdog Timer
6. External Input
7. Fixed-Interval Timer
8. Decrementer
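The ordering above can be expressed as a simple lookup. The sketch below is an illustration only: it treats the first item (the group of synchronous interrupt types, of which at most one can have an exception at a time) as a single entry with the hypothetical label "Synchronous", and the function name `next_interrupt` is likewise an assumption.

```python
# Highest priority first, following the numbered list above.
PRIORITY = [
    "Synchronous",           # item 1: whichever one of the listed synchronous types exists
    "Machine Check",
    "Debug",
    "Critical Input",
    "Watchdog Timer",
    "External Input",
    "Fixed-Interval Timer",
    "Decrementer",
]


def next_interrupt(pending):
    """Return the highest-priority enabled interrupt type in `pending`, or None."""
    for kind in PRIORITY:
        if kind in pending:
            return kind
    return None
```

For example, if both a Decrementer and a Critical Input exception exist and both interrupt types are enabled, `next_interrupt` selects the Critical Input interrupt.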
For any single instruction attempting to cause multiple exceptions for which the
corresponding synchronous interrupt types are enabled, this section defines the
priority order by which the instruction will be permitted to cause a single enabled
exception, thus generating a particular synchronous interrupt. Note that it is this
exception priority mechanism, along with the requirement that synchronous
interrupts be generated in program order, that guarantees that at any given time,
there exists for consideration only one of the synchronous interrupt types listed in
item 1 of Section 7.8.2 on page 176. The exception priority mechanism also pre-
vents certain debug exceptions from existing in combination with certain other
synchronous interrupt-generating exceptions.
Because unaligned Load and Store instructions, or Load Multiple, Store Multiple,
Load String, and Store String instructions may be broken up into multiple, smaller
accesses, and these accesses may be performed in any order, the exception priority
mechanism applies to each of the multiple storage accesses in the order they
are performed by the implementation.
This section does not define the permitted setting of multiple exceptions for which
the corresponding interrupt types are disabled. The generation of exceptions for
which the corresponding interrupt types are disabled will have no effect on the
generation of other exceptions for which the corresponding interrupt types are
enabled. Conversely, if a particular exception for which the corresponding inter-
rupt type is enabled is shown in the following sections to be of a higher priority
than another exception, it will prevent the setting of that other exception, inde-
pendent of whether that other exception’s corresponding interrupt type is enabled
or disabled.
Except as specifically noted, only one of the exception types listed for a given
instruction type will be permitted to be generated at any given time. The exception
types are listed in the following sections in priority order, from highest to
lowest, within each instruction type.
Note
Some exception types may even be mutually exclusive of each other and could other-
wise be considered the same priority. In these cases, the exceptions are listed in the
order suggested by the sequential execution model.
For mtmsr, mtspr (DBCR0, DBCR1, DBCR2), mtspr (TCR), and mtspr (TSR), if
they are causing neither Debug (Instruction Address Compare) nor Program
(Privileged Instruction) exceptions, it is possible that they are simultaneously
enabling (via mask bits) multiple existing exceptions (and at the same time possibly
causing a Debug (Instruction Complete) exception). When this occurs, the interrupts
will be handled in the order defined by Section 7.8.2 on page 176.
If the rfi or rfci instruction is causing both a Debug (Instruction Address Com-
pare) and a Debug (Return From Interrupt), and is not causing any of the excep-
tions listed in items 2-5, it is permissible for both exceptions to be generated and
recorded in the Debug Status Register. A single Debug interrupt will result.
The following prioritized list of exceptions may occur as a result of the attempted
execution of any reserved instruction.
8.1 Overview
The Time Base (TB), Decrementer (DEC), Fixed-Interval Timer (FIT), and Watchdog
Timer (WDT) provide timing functions for the system. All of these must be initial-
ized during start-up.
• The Decrementer, a counter that is updated at the same rate as the Time
Base, provides a means of signaling an exception after a specified amount of
time has elapsed unless:
• The Fixed-Interval Timer is really a selected bit of the Time Base, which pro-
vides a means of signalling an exception whenever the selected bit transitions
from 0 to 1, in a repetitive fashion. The Fixed-Interval Timer is typically used
to trigger periodic system maintenance functions. Software may select one of
four bits in the Time Base to serve as the Fixed-Interval Timer. Which bits
may be selected is implementation-dependent.
• The Watchdog Timer is also a selected bit of the Time Base, which provides a
means of signalling a critical class exception whenever the selected bit transi-
tions from 0 to 1. In addition, if software does not respond in time to the
initial exception (by clearing the associated status in the Timer Status Regis-
ter (TSR) prior to the next expiration of the Watchdog Timer interval), then a
Watchdog Timer-generated processor reset may result, if so enabled. The
Watchdog Timer is typically used to provide a system error recovery function.
The relationship of these Timer facilities to each other is illustrated in Figure 8-1
below.
[Figure 8-1: relationship of the Timer facilities to the Time Base. Recoverable
labels: Decrementer (DEC), Decrementer event on 0/1 detect, auto-reload from DECAR.]
The Timer Control Register (TCR) is a 32-bit register. Timer Control Register bits
are numbered 32 (most-significant bit) to 63 (least-significant bit). The Timer Con-
trol Register controls Decrementer (see Section 8.5), Fixed-Interval Timer (see
Section 8.6), and Watchdog Timer (see Section 8.7) options. Table 8-1 specifies
the bit definitions of the Timer Control Register.
The contents of the Timer Control Register can be read into bits 32:63 of a
GPR(RT) using mfspr RT,TCR, setting bits 0:31 of GPR(RT) to zero. The contents of
bits 32:63 of GPR(RS) can be written to the Timer Control Register using mtspr
TCR,RS.
Bit(s) Description
32:33 Watchdog Timer Period (WP) (See Section 8.7)
Specifies one of 4 bit locations of the Time Base used to signal a Watchdog Timer
exception on a transition from 0 to 1. The 4 Time Base bits that can be specified to
serve as the Watchdog Timer period are implementation-dependent.
The Timer Status Register (TSR) is a 32-bit register. Timer Status Register bits are
numbered 32 (most-significant bit) to 63 (least-significant bit). The Timer Status
Register contains status on timer events and the most recent Watchdog Timer-ini-
tiated processor reset.
The Timer Status Register is set via hardware, and read and cleared via software.
The contents of the Timer Status Register can be read into bits 32:63 of a GPR(RT)
using mfspr RT,TSR, setting bits 0:31 of GPR(RT) to zero. Bits in the Timer Status
Register can be cleared using mtspr TSR,RS. Clearing is done by writing bits 32:63
of a General Purpose Register to the Timer Status Register with a 1 in any bit
position that is to be cleared and 0 in all other bit positions. The write-data to the
Timer Status Register is not direct data, but a mask. A 1 causes the bit to be
cleared, and a 0 has no effect.
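The mask semantics of a write to the Timer Status Register can be modelled in a few lines. The following Python sketch is illustrative only: the function name tsr_write and the two bit constants are hypothetical, and the bit positions are placeholders, since the bit definitions belong to the Timer Status Register table.

```python
# Placeholder 32-bit mask positions, chosen only for illustration; the real
# bit assignments are defined by the Timer Status Register table.
TSR_DIS = 0x0800_0000   # stands in for TSRDIS
TSR_FIS = 0x0400_0000   # stands in for TSRFIS

def tsr_write(tsr: int, mask: int) -> int:
    """Model mtspr TSR,RS: each 1 in mask clears that TSR bit; 0 bits have no effect."""
    return tsr & ~mask & 0xFFFF_FFFF

tsr = TSR_DIS | TSR_FIS          # both events are pending
tsr = tsr_write(tsr, TSR_DIS)    # acknowledge the Decrementer event only
```

After the write, only the bit standing in for TSRFIS remains set. Writing a value of all zeros leaves the register unchanged, which is why the handler does not need a read-modify-write sequence.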
Bit(s) Description
8.4.1 Overview
The Time Base (TB) is composed of two 32-bit registers, the Time Base Upper
(TBU) concatenated on the right with the Time Base Lower (TBL). Time Base
Upper bits are numbered 32 (most-significant bit) to 63 (least-significant bit).
Time Base Lower bits are numbered 32 (most-significant bit) to 63 (least-signifi-
cant bit). The Time Base is interpreted as a 64-bit unsigned integer that is incre-
mented periodically. Each increment adds 1 to the least-significant bit. The
frequency at which the integer is updated is implementation-dependent.
The Time Base provides timing functions for the system. The Time Base is a vola-
tile resource and must be initialized during start-up.
The contents of the Time Base Upper can be read into bits 32:63 of a General Pur-
pose Register using mfspr RT,TBU, setting bits 0:31 of GPR(RT) to an undefined
value. The contents of bits 32:63 of GPR(RS) can be written to the Time Base
Upper using mtspr TBU,RS.
The contents of the Time Base Lower can be read into bits 32:63 of a General Pur-
pose Register using mfspr RT,TBL, setting bits 0:31 of GPR(RT) to an undefined
value. The contents of bits 32:63 of GPR(RS) can be written to the Time Base
Lower using mtspr TBL,RS.
There is no automatic initialization of the Time Base; system software must per-
form this initialization.
The Time Base Lower increments until its value becomes 0xFFFF_FFFF (2^32–1). At
the next increment, its value becomes 0x0000_0000 and Time Base Upper is
incremented. This process continues until both the Time Base Upper and the Time
Base Lower contain 0xFFFF_FFFF, that is, until the Time Base as a whole contains
0xFFFF_FFFF_FFFF_FFFF (2^64–1). At the next increment, both the Time Base Upper
and the Time Base Lower become 0x0000_0000. There is no interrupt or other
indication when this occurs.
The period of the Time Base depends on the driving frequency. As an order of
magnitude example, suppose that the Time Base is driven by a frequency of
100MHz divided by 32. Then the period of the Time Base would be
T_TB = 2^64 × 32 / 100MHz ≈ 5.90 × 10^12 seconds
which is approximately 187,000 years.
The Time Base must be implemented such that the following requirements are
satisfied.
• Loading a General Purpose Register from the Time Base shall have no effect
on the accuracy of the Time Base.
Book E does not specify a relationship between the frequency at which the Time
Base is updated and other frequencies, such as the CPU clock or bus clock in a
Book E system. The Time Base update frequency is not required to be constant.
What is required, so that system software can keep time of day and operate inter-
val timers, is one of the following.
• The update frequency of the Time Base is under the control of the system
software.
Implementations must provide a means for either preventing the Time Base from
incrementing or preventing the Time Base from being read in problem state
(MSRPR=1). If the means is under software control, the Time Base must be acces-
sible only in privileged state (MSRPR=0).
Architecture Note
Disabling the Time Base or making reading the Time Base privileged prevents the Time
Base from being used to implement a ‘covert channel’ in a secure system.
The requirements stated above for the Time Base apply also to any other SPRs that mea-
sure time and can be read in problem state (e.g., Performance Monitor registers).
Programming Note
If the operating system initializes the Time Base on power-on to some reasonable value
and the update frequency of the Time Base is constant, the Time Base can be used as a
source of values that increase at a constant rate, such as for time stamps in trace
entries.
Even if the update frequency is not constant, values read from the Time Base are
monotonically increasing (except when the Time Base wraps from 2^64–1 to 0). If a
trace entry is recorded each time the update frequency changes, the sequence of
Time Base values can be post-processed to become actual time values.
See Section 8.4.4 on page 191 for ways to compute time of day in POSIX format from the
Time Base.
Architecture Note
It is intended that the Time Base be useful for timing reasonably short sequences of
code (a few hundred instructions) and for low-overhead time stamps for tracing. The
Time Base should not ‘tick’ faster than the CPU instruction clock. Driving the Time Base
directly from the CPU instruction clock is probably finer granularity than necessary; the
instruction clock divided by 8, 16, or 32 would be more appropriate.
It is not possible to write the entire 64-bit Time Base using a single instruction.
The Time Base can be written by a sequence such as:
Provided that no interrupts occur while the last three instructions are being exe-
cuted, loading 0 into Time Base Lower prevents the possibility of a carry from
Time Base Lower to Time Base Upper while the Time Base is being initialized.
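The effect of zeroing Time Base Lower first can be seen in a small model. The Python sketch below is an assumption-laden illustration, not the instruction sequence itself: the class TimeBase, the function set_time_base, and the choice of exactly one tick elapsing after each write are all hypothetical.

```python
MASK32 = 0xFFFF_FFFF

class TimeBase:
    """Toy model of the 64-bit Time Base as two 32-bit halves."""
    def __init__(self, tbu=0, tbl=0):
        self.tbu, self.tbl = tbu, tbl

    def tick(self, n=1):
        # Increment the 64-bit value, carrying from TBL into TBU.
        value = (((self.tbu << 32) | self.tbl) + n) & 0xFFFF_FFFF_FFFF_FFFF
        self.tbu, self.tbl = value >> 32, value & MASK32

def set_time_base(tb, upper, lower, zero_first=True):
    """Write TBU then TBL, with one tick assumed to elapse after each write."""
    if zero_first:
        tb.tbl = 0          # models mtspr TBL,Rz with Rz = 0
        tb.tick()
    tb.tbu = upper          # models mtspr TBU,Rx
    tb.tick()
    tb.tbl = lower          # models mtspr TBL,Ry
    tb.tick()

safe = TimeBase(tbu=7, tbl=MASK32)      # TBL is about to carry
set_time_base(safe, upper=0x10, lower=0x100)

unsafe = TimeBase(tbu=7, tbl=MASK32)
set_time_base(unsafe, upper=0x10, lower=0x100, zero_first=False)
```

In this model the zero-first sequence leaves Time Base Upper at the intended 0x10, while the sequence without the initial zeroing ends with 0x11 because the old Time Base Lower carried into the newly written upper half.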
It is not possible to read the entire 64-bit Time Base in a single instruction. mfspr
RT,TBL moves the lower half of the Time Base (TBL) to a GPR, and mfspr RT,TBU
moves the upper half (TBU) to a GPR. Because a carry from Time Base Lower to
Time Base Upper can occur between the reads of the two halves, a sequence such
as the following is necessary to read the Time Base.
loop:
mfspr Rx,TBU #load from TBU
mfspr Ry,TBL #load from TBL
mfspr Rz,TBU #load from TBU
cmp cr0,0,Rz,Rx #see if 'old' = 'new'
bc 4,2,loop #loop if carry occurred
The comparison and loop are necessary to ensure that a consistent pair of values
has been obtained.
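The same read-and-compare loop can be exercised against a simulated Time Base. The Python sketch below is illustrative: read_time_base mirrors the three mfspr reads and the compare, while CarryingTimeBase is a hypothetical test double that advances by one tick on every read so that a carry is forced between the first two reads.

```python
def read_time_base(read_tbu, read_tbl):
    """Return a consistent 64-bit value assembled from 32-bit reads."""
    while True:
        upper = read_tbu()          # mfspr Rx,TBU
        lower = read_tbl()          # mfspr Ry,TBL
        if read_tbu() == upper:     # mfspr Rz,TBU; cmp; bc
            return (upper << 32) | lower

class CarryingTimeBase:
    """Toy Time Base that advances by one tick on every register read."""
    def __init__(self, value):
        self.value = value

    def _advance(self):
        current = self.value
        self.value = (self.value + 1) & 0xFFFF_FFFF_FFFF_FFFF
        return current

    def read_tbu(self):
        return self._advance() >> 32

    def read_tbl(self):
        return self._advance() & 0xFFFF_FFFF

tb = CarryingTimeBase(0x0000_0001_FFFF_FFFF)   # the next tick carries into TBU
value = read_time_base(tb.read_tbu, tb.read_tbl)
```

On the first pass the upper half reads as 1 and the lower half as 0, a pair that never existed together; the mismatch on the second upper read forces a retry, and the second pass returns a consistent value.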
Assume that:
1. Described in POSIX Draft Standard P1003.4/D12, Draft Standard for Information Technology -- Portable Operating System
Interface (POSIX) -- Part 1: System Application Program Interface (API) - Amendment 1: Real-time Extension [C Language].
Institute of Electrical and Electronics Engineers, Inc., Feb. 1992.
100MHz / 32 = 3,125,000
which is the number of times the Time Base is updated each second.
The POSIX clock can be computed with an instruction sequence such as this:
In the absence of a divd instruction (see Appendix A, “Guidelines for 32-bit Book
E”, on page 371), direct implementation of the algorithm given above is awkward
(see footnote 1 below).
Such division can be avoided entirely if a time of day clock in POSIX format is
updated at least once each second.
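For orientation, a direct conversion of the kind that needs a 64-bit division can be sketched as follows. This Python fragment is an illustration under stated assumptions: it uses the 3,125,000 ticks-per-second example frequency, it assumes the Time Base was initialized so that zero corresponds to the start of the POSIX epoch, and the names TICKS_PER_SEC, NS_PER_TICK, and tb_to_posix are hypothetical.

```python
TICKS_PER_SEC = 3_125_000                      # 100MHz / 32, the example frequency
NS_PER_TICK = 1_000_000_000 // TICKS_PER_SEC   # 320 ns; exact for this frequency

def tb_to_posix(tb: int) -> tuple:
    """Convert a 64-bit tick count into (seconds, nanoseconds)."""
    seconds, remainder = divmod(tb, TICKS_PER_SEC)   # the 64-bit division
    return seconds, remainder * NS_PER_TICK

example = tb_to_posix(3_125_000 * 10 + 1)   # ten seconds and one tick
```

The divmod call is the step that is awkward without a 64-bit divide instruction; everything else is 32-bit arithmetic.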
Assume that:
These variables hold the value of the Time Base and the computed POSIX sec-
ond and nanosecond values from the last time the POSIX clock was
computed.
1. See D. E. Knuth, The Art of Computer Programming, Volume 2, Seminumerical Algorithms, Section 4.3.1, Algorithm D. Addi-
son-Wesley, 1981.
The POSIX clock can be computed with an instruction sequence such as this:
loop:
mfspr Rx,TBU #Rx = TBU
mfspr Ry,TBL #Ry = TBL
mfspr Rz,TBU #Rz = 'new' TBU value
cmp CR0,0,Rz,Rx #see if 'old' = 'new'
bc 4,2,loop #loop if carry occurred
# now have 64-bit TB in Rx and Ry
lwz Rz,posix_tb+4
sub Rz,Ry,Rz #Rz = delta in ticks
lwz Rw,ns_adj
mullw Rz,Rz,Rw #Rz = delta in ns
lwz Rw,posix_ns
add Rz,Rz,Rw #Rz = new ns value
lwz Rw,billion
cmp CR0,0,Rz,Rw #see if past 1 second
bc 12,0,nochange #branch if not
sub Rz,Rz,Rw #adjust nanoseconds
lwz Rw,posix_sec
addi Rw,Rw,1 #adjust seconds
stw Rw,posix_sec #store new seconds
nochange:
stw Rz,posix_ns #store new ns
stw Rx,posix_tb #store new time base
stw Ry,posix_tb+4
Note that the upper half of the Time Base does not participate in the calculation to
determine the new POSIX time of day. This is correct as long as the time change
does not exceed one second.
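The incremental update performed by the instruction sequence can be restated in Python to make the arithmetic easier to follow. The sketch below is a model, not a replacement for the assembly: the state dictionary reuses the variable names posix_tb, posix_sec, and posix_ns from the listing, and NS_ADJ assumes the 3,125,000 ticks-per-second example frequency.

```python
TICKS_PER_SEC = 3_125_000
NS_ADJ = 1_000_000_000 // TICKS_PER_SEC   # nanoseconds per tick (320)
BILLION = 1_000_000_000

def update_posix_clock(state, tb):
    """Advance the POSIX clock using only the low 32 bits of the Time Base."""
    delta = (tb - state["posix_tb"]) & 0xFFFF_FFFF   # sub on the low words
    ns = state["posix_ns"] + delta * NS_ADJ          # mullw, then add
    if ns >= BILLION:                                # at most one second has passed
        ns -= BILLION
        state["posix_sec"] += 1
    state["posix_ns"] = ns
    state["posix_tb"] = tb

state = {"posix_tb": 0, "posix_sec": 0, "posix_ns": 0}
update_posix_clock(state, 3_000_000)    # 0.96 s after the last update
update_posix_clock(state, 3_250_000)    # a further 0.08 s
```

The second call pushes the nanosecond count past one billion, so one second is carried into posix_sec. Because only a single subtraction of one billion is performed, the routine is correct only if it runs at least once per second, as the text requires.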
In a system in which the update frequency of the Time Base may change over
time, it is not possible to convert an isolated Time Base value into time of day.
Instead, a Time Base value has meaning only with respect to the current update
frequency and the time of day that the update frequency was last changed. Each
time the update frequency changes, either the system software is notified of the
change via an interrupt (see Chapter 7 on page 143), or the change was instigated
by the system software itself. At each such change, the system software must
compute the current time of day using the old update frequency, compute a new
value of ticks_per_sec for the new frequency, and save the time of day, Time Base
value, and tick rate. Subsequent calls to compute time of day use the current
Time Base value and the saved data.
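The bookkeeping described above can be sketched as a small data structure. The Python below is a hypothetical illustration: the names ClockBase, time_of_day_ns, and on_frequency_change are invented for this example, and time of day is held as a plain nanosecond count rather than in POSIX second and nanosecond fields.

```python
from dataclasses import dataclass

NS_PER_SEC = 1_000_000_000

@dataclass
class ClockBase:
    tod_ns: int          # time of day at the last frequency change, in ns
    tb: int              # Time Base value at that moment
    ticks_per_sec: int   # update frequency in effect since then

def time_of_day_ns(base: ClockBase, tb_now: int) -> int:
    """Time of day for tb_now, valid only under the current frequency."""
    elapsed_ticks = tb_now - base.tb
    return base.tod_ns + elapsed_ticks * NS_PER_SEC // base.ticks_per_sec

def on_frequency_change(base: ClockBase, tb_now: int, new_ticks_per_sec: int) -> ClockBase:
    """Compute time of day with the old frequency, then save a new reference point."""
    return ClockBase(time_of_day_ns(base, tb_now), tb_now, new_ticks_per_sec)

base = ClockBase(tod_ns=0, tb=0, ticks_per_sec=3_125_000)
base = on_frequency_change(base, tb_now=3_125_000, new_ticks_per_sec=6_250_000)
now_ns = time_of_day_ns(base, tb_now=3_125_000 + 6_250_000)
```

One second elapses at the old rate and one more at the doubled rate, so the computed time of day is two seconds even though the Time Base advanced three times as far as it would have in one second at the original rate.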
The contents of the Decrementer can be read into bits 32:63 of a General Purpose
Register using mfspr RT,DEC, setting bits 0:31 of GPR(RT) to zero. The contents of
bits 32:63 of GPR(RS) can be written to the Decrementer using mtspr DEC,RS.
The contents of the Decrementer Auto-Reload Register cannot be read. The con-
tents of bits 32:63 of GPR(RS) can be written to the Decrementer Auto-Reload
Register using mtspr DECAR,RS.
The Decrementer decrements at the same rate that the Time Base increments. A
Decrementer event occurs when a decrement occurs on a Decrementer value of
0x0000_0001. Upon the occurrence of a Decrementer event, the Decrementer has
the following basic modes of operation.
The Decrementer interrupt handler must reset TSRDIS in order to avoid taking
another, redundant Decrementer interrupt once MSREE is re-enabled (assuming
TCRDIE is not cleared instead). This is done by writing a word to the Timer Status
Register using mtspr with a 1 in the bit corresponding to TSRDIS (and any other
bits that are to be cleared) and 0 in all other bits. The write-data to the Timer
Status Register is not direct data, but a mask. A 1 causes the bit to be cleared,
and a 0 has no effect.
Forcing the Decrementer to 0 using the mtspr instruction will not cause a Decre-
menter exception; however, decrementing which was in progress at the instant of
the mtspr may cause the exception. To eliminate the Decrementer as a source of
exceptions, set TCRDIE to 0 (clear the Decrementer Interrupt Enable bit).
4. Write 1 to TSRDIS. This action will clear TSRDIS to 0 (see Section 8.3 on
page 188). This will clear any Decrementer exception which may be pending.
Because the Decrementer is frozen at zero, no further Decrementer events are
possible.
If the auto-reload feature is disabled (TCRARE=0), then once the Decrementer dec-
rements to zero, it will stay there until software reloads it using the mtspr
instruction.
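The Decrementer behavior described in this section can be summarized in a toy model. The Python sketch below rests on assumptions: the class and attribute names are hypothetical, event_pending stands in for TSRDIS, and the auto-reload is modelled as taking effect immediately, which simplifies whatever timing an implementation actually uses.

```python
class Decrementer:
    """Toy model of DEC with optional auto-reload from DECAR."""
    def __init__(self, value, decar=0, auto_reload=False):
        self.dec, self.decar, self.auto_reload = value, decar, auto_reload
        self.event_pending = False      # stands in for TSRDIS

    def tick(self):
        if self.dec == 0:
            return                      # frozen at zero until software reloads it
        self.dec -= 1
        if self.dec == 0:               # a decrement occurred on a value of 1
            self.event_pending = True
            if self.auto_reload:
                self.dec = self.decar   # assumed: reload takes effect at once

    def write(self, value):
        self.dec = value                # mtspr DEC,RS; forcing 0 raises no event

one_shot = Decrementer(2)
for _ in range(3):
    one_shot.tick()

periodic = Decrementer(2, decar=3, auto_reload=True)
periodic.tick()
periodic.tick()

forced = Decrementer(5)
forced.write(0)
forced.tick()
```

The one-shot instance stops at zero with its event pending, the periodic instance is back at its reload value of 3, and the instance forced to zero by a write never records an event.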
Book E has changed the definition of the Decrementer from the PowerPC Architec-
ture definition. However, Book E permits implementations of the architecture to
provide implementation-dependent facilities to enable emulation of the PowerPC
Architecture definition of the Decrementer for compatibility with legacy systems.
There are a number of approaches to providing this compatibility support, and
hence Book E allows implementation-dependent discretion. A new mode which
causes the Decrementer to operate in the PowerPC-compatible mode would be
required by implementations that support such compatibility. A mode bit that
enables this new mode is recommended to be placed in the Timer Control Register.
The PowerPC Decrementer operates continuously, wrapping from 0 to 0xFFFF_FFFF,
and generates an interrupt on every transition of bit 0 of the Decrementer from 0
to 1, including such a transition caused by an mtspr DEC,RS. A
Book E implementation that supports this PowerPC Decrementer mode would be
required to also provide facilities that enable the Book E Decrementer behavior.
The Fixed-Interval Timer (FIT) is a mechanism for providing timer interrupts with
a repeatable period, to facilitate system maintenance. It is similar in function to
an auto-reload Decrementer, except that there are fewer selections of interrupt
period available. The Fixed-Interval Timer exception occurs on 0 to 1 transitions
of a selected bit from the Time Base (see Section 8.2 on page 186).
The interrupt handler must reset TSRFIS via software, in order to avoid another
Fixed-Interval Timer interrupt once MSREE is re-enabled (assuming TCRFIE is not
cleared instead). This is done by writing a word to the Timer Status Register using
mtspr with a 1 in the bit corresponding to TSRFIS (and any other bits that are to
be cleared) and 0 in all other bits. The write-data to the Timer Status Register is
not direct data, but a mask. A 1 causes the bit to be cleared, and a 0 has no
effect.
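Detection of a 0 to 1 transition on a selected Time Base bit can be illustrated as follows. This Python sketch assumes that Time Base bits are numbered from 0 (most significant) to 63 (least significant); the function names are hypothetical, and bit 60 is an arbitrary choice made to keep the example short, not a selection any implementation is known to offer.

```python
def tb_bit_mask(bit: int) -> int:
    """Mask for Time Base bit number `bit`, where bit 0 is the most significant."""
    return 1 << (63 - bit)

def rising_edge(prev_tb: int, new_tb: int, bit: int) -> bool:
    """True if the selected Time Base bit changed from 0 to 1."""
    mask = tb_bit_mask(bit)
    return (prev_tb & mask) == 0 and (new_tb & mask) != 0

def count_events(start: int, ticks: int, bit: int) -> int:
    """Count 0-to-1 transitions of the selected bit over `ticks` increments."""
    events, tb = 0, start
    for _ in range(ticks):
        nxt = (tb + 1) & 0xFFFF_FFFF_FFFF_FFFF
        events += rising_edge(tb, nxt, bit)
        tb = nxt
    return events

events_in_64_ticks = count_events(0, 64, bit=60)
```

Under this numbering, bit 60 rises once every 16 ticks, so four events occur in 64 ticks. The transition from 1 back to 0 is not an event.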
The Watchdog Timer is a facility intended to aid system recovery from faulty soft-
ware or hardware. Watchdog time-outs occur on 0 to 1 transitions of selected bits
from the Time Base (see Section 8.2 on page 186).
When a Watchdog Timer time-out occurs while Watchdog Timer Interrupt Status
is clear (TSRWIS = 0) and the next Watchdog Time-out is enabled (TSRENW = 1), a
Watchdog Timer exception is generated and logged by setting TSRWIS to 1. This is
referred to as a Watchdog Timer First Time Out. A Watchdog Timer interrupt will
occur if enabled by TCRWIE and MSRCE. See Section 7.6.13 on page 169 for
details of register behavior caused by the Watchdog Timer interrupt.
The interrupt handler must reset TSRWIS via software, in order to avoid another
Watchdog Timer interrupt once MSRCE is re-enabled (assuming TCRWIE is not
cleared instead). This is done by writing a word to the Timer Status Register using
mtspr with a 1 in the bit corresponding to TSRWIS (and any other bits that are to
be cleared) and a 0 in all other bits. The write-data to the Timer Status Register is
not direct data, but a mask. A 1 causes the bit to be cleared, and a 0 has no effect.
Note that a Watchdog Timer exception will also occur if the selected Time Base bit
transitions from 0 to 1 due to an mtspr TBL,RS that writes a 1 to the bit when its
previous value was 0.
[Figure: Watchdog Timer state machine. Recoverable labels: states
TSRENW,WIS=0b00 and TSRENW,WIS=0b10, with transitions made by a software loop
or by the Watchdog interrupt handler; on a time-out, the WDT exception is recorded
in TSRWIS, and a WDT interrupt will occur if enabled by TCRWIE and MSRCE.]
Enable Next WDT (TSRENW)    WDT Status (TSRWIS)    Action when timer interval expires
The controls described in the above table imply three different modes of operation
that a programmer might select for the Watchdog Timer. Each of these modes
assumes that TCRWRC has been set to allow processor reset by the Watchdog facil-
ity:
1. Always take the Watchdog Timer interrupt when pending, and never attempt
to prevent its occurrence. In this mode, the Watchdog Timer interrupt caused
by a first time-out is used to clear TSRWIS so a second time-out never occurs.
TSRENW is not cleared, thereby allowing the next time-out to cause another
interrupt.
2. Always take the Watchdog Timer interrupt when pending, but avoid when
possible. In this mode a recurring code loop of reliable duration (or perhaps a
periodic interrupt handler such as the Fixed-Interval Timer interrupt han-
dler) is used to repeatedly clear TSRENW such that a first time-out exception is
avoided, and thus no Watchdog Timer interrupt occurs. Once TSRENW has
been cleared, software has between one and two full Watchdog periods before
a Watchdog exception will be posted in TSRWIS. If this occurs before the soft-
3. Never take the Watchdog Timer interrupt. In this mode, Watchdog Timer
interrupts are disabled (via TCRWIE=0), and the system depends upon a recur-
ring code loop of reliable duration (or perhaps a periodic interrupt handler
such as the Fixed-Interval Timer interrupt handler) to repeatedly clear TSRWIS
such that a second time-out is avoided, and thus no reset occurs. TSRENW
is not cleared, thereby allowing the next time-out to set TSRWIS again. The
recurring code loop must have a period which is less than one Watchdog
Timer period in order to guarantee that a Watchdog Timer reset will not occur.
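The three modes can be compared with a small state-machine model. The Python sketch below is an interpretation, not a transcription of the table: the action taken when TSRENW is 0 (the time-out only sets TSRENW) is inferred from the statement that software has between one and two Watchdog periods once TSRENW has been cleared, and the function names and action labels are hypothetical.

```python
def watchdog_timeout(enw: int, wis: int):
    """Return (enw, wis, action) after one Watchdog Timer time-out."""
    if not enw:
        return 1, wis, "arm"          # inferred: expiry only sets TSRENW
    if not wis:
        return 1, 1, "exception"      # first time-out: TSRWIS is set
    return 1, 1, "reset"              # second time-out: reset, if enabled by TCRWRC

def run(periods, service):
    """Let software adjust (enw, wis) via `service` before each time-out."""
    enw = wis = 0
    actions = []
    for _ in range(periods):
        enw, wis = service(enw, wis)
        enw, wis, action = watchdog_timeout(enw, wis)
        actions.append(action)
    return actions

mode2 = run(4, lambda enw, wis: (0, wis))    # loop keeps clearing TSRENW
mode3 = run(4, lambda enw, wis: (enw, 0))    # loop keeps clearing TSRWIS
hung = run(4, lambda enw, wis: (enw, wis))   # software never responds
```

In this model, a loop that clears TSRENW every period never sees an exception, a loop that clears only TSRWIS sees an exception each period but never a reset, and software that stops responding reaches the reset action on the third time-out.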
The debug mechanism provides a means of temporarily freezing the timers upon a
debug event. Specifically, the Time Base and Decrementer can be frozen and pre-
vented from incrementing/decrementing, respectively, whenever a debug event is
set in the Debug Status Register. This allows a debugger to simulate the appear-
ance of ‘real time’, even though the application has been temporarily ‘halted’ to
service the debug event. See the description of bit 63 of the Debug Control Regis-
ter 0 (Freeze Timers on Debug Event or DBCR0FT) in Table 9-1 on page 210.
9.1 Background
Book E provides debug facilities to enable hardware and software debug functions,
such as instruction and data breakpoints and program single stepping.
The debug facilities consist of a set of debug control registers (DBCR0, DBCR1,
and DBCR2) described in Section 9.4.1, a set of address and data value compare
registers (IAC1, IAC2, IAC3, IAC4, DAC1, DAC2, DVC1, and DVC2) described in
Sections 9.4.3, 9.4.4, and 9.4.5, a Debug Status Register (DBSR) described in
Section 9.4.2 for enabling and recording various kinds of debug events, and a
special Debug interrupt type built into the interrupt mechanism (see Section 7.6.16
on page 172). The debug facilities also provide a mechanism for software-controlled
processor reset, and for controlling the operation of the timers in a debug
environment.
Access to the debug facilities using the mfspr and mtspr instructions, as well as
the debug interrupt mechanism, are defined as part of Book E. In addition, imple-
mentations will typically include debug facilities, modes, and access mechanisms
which are implementation-specific and defined as part of the User’s Manual for
the implementation. For example, implementations will typically provide access to
the debug facilities via a dedicated interface such as the IEEE 1149.1 Test Access
Port (JTAG).
Almost all debug exceptions fall into the first category. That is, they all take the inter-
rupt upon encountering an instruction having the exception without updating any
architectural state (other than DBSR, CSRR0, CSRR1, MSR) for that instruction.
The CSRR0 for this type of exception points to the instruction that encountered the
exception. This includes IAC, DAC, branch taken, etc.
The only exception that falls into the second category is the instruction complete debug
exception. This exception is taken after one instruction has completed and updated
architectural state, with CSRR0 pointing to the next instruction to execute.
To make forward progress for any Type 1 debug exception one does the following:
1. Software sets up Type 1 exceptions (e.g. branch taken debug exceptions) and then
returns to normal program operation
2. Hardware takes critical debug interrupt upon the first branch taken debug
exception, pointing to the branch with CSRR0.
3. Software, in the debug handler, sees the branch taken exception type, does whatever
logging/analysis it wants to, then clears all debug event enables in the DBCR except
for the instruction complete debug event enable.
4. Software does an rfci to return to the interrupted program.
5. Hardware would execute and complete one instruction (the branch taken in this
case), and then take a critical debug interrupt with CSRR0 pointing to the target of
the branch.
6. Software would see the instruction complete interrupt type. It clears the instruction
complete event enable, then enables the branch taken interrupt event again.
7. Software does an rfci to return to the interrupted program.
8. Hardware resumes on the target of the taken branch and continues until another
taken branch, in which case we end up at step 2 again.
This, at first, seems like a double tax (i.e. 2 debug interrupts for every instance of a Type
1 exception), but there does not seem to be any other clean way to make forward progress
on Type 1 debug exceptions. The only way to avoid the double tax is to have the
debug handler routine actually emulate the instruction pointed to for the Type 1
exception, determine the next instruction that would have been executed by the
interrupted program flow, load CSRR0 with that address, and do an rfci; this is
probably not faster.
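The two-interrupt cycle can be traced with a short simulation. The Python sketch below is a simplified model with hypothetical names: the program is represented only by a list recording whether each executed instruction is a taken branch, positions in that list stand in for CSRR0 values, and the handler's work is reduced to swapping which debug event is enabled.

```python
def run_with_branch_tracing(trace):
    """trace: booleans, True where the executed instruction is a taken branch.
    Returns the debug interrupts taken, as (event, position) pairs."""
    log = []
    enabled = "branch_taken"
    pc = 0
    while pc < len(trace):
        if enabled == "branch_taken" and trace[pc]:
            # Type 1: interrupt before the branch executes; CSRR0 points at it.
            log.append(("branch_taken", pc))
            enabled = "instruction_complete"   # handler swaps enables, then rfci
            continue                           # the same instruction is retried
        pc += 1                                # the instruction completes
        if enabled == "instruction_complete":
            # Type 2: interrupt after completion; CSRR0 points at the next one.
            log.append(("instruction_complete", pc))
            enabled = "branch_taken"           # handler restores enables, then rfci
    return log

log = run_with_branch_tracing([False, True, False, True])
```

Each of the two taken branches in the example produces a branch taken interrupt followed by an instruction complete interrupt, four interrupts in total, which is the double tax described above.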
Debug events include such things as instruction and data breakpoints. These
debug events cause status bits to be set in the Debug Status Register. The exist-
ence of a set bit in the Debug Status Register is considered a Debug exception.
Debug exceptions, if enabled, will cause Debug interrupts.
There are two different mechanisms that control whether Debug interrupts are
enabled. The first is the MSRDE bit, and this bit must be set to 1 to enable Debug
interrupts. The second mechanism is an enable bit in the Debug Control Register
0 (DBCR0). This bit is the Internal Debug Mode bit (DBCR0IDM), and it must also
be set to 1 to enable Debug interrupts.
When DBCR0IDM=1, the processor is in Internal Debug Mode. In this mode, debug
events will (if also enabled by MSRDE) cause Debug interrupts. Software at the
Debug interrupt vector location will thus be given control upon the occurrence of
a debug event, and can access (via the normal instructions) all architected proces-
sor resources. In this fashion, debug monitor software can control the processor
and gather status, and interact with debugging hardware connected to the proces-
sor.
When the processor is not in Internal Debug Mode (DBCR0IDM=0), debug events
may still occur and be recorded in the Debug Status Register. These exceptions
may be monitored via software by reading the Debug Status Register (using
mfspr), or may eventually cause a Debug interrupt if later enabled by setting
DBCR0IDM=1 (and MSRDE=1). Processor behavior when debug events occur while
DBCR0IDM=0 is implementation-dependent. The remainder of this chapter dis-
cusses processor behavior with the presumption that DBCR0IDM=1.
Debug events are used to cause Debug exceptions to be recorded in the Debug
Status Register (see Table 9-4 on page 217). In order for a debug event to be
enabled to set a Debug Status Register bit and thereby cause a Debug exception,
the specific event type must be enabled by a corresponding bit or bits in the
Debug Control Registers (DBCR0 defined in Table 9-1, DBCR1 defined in Table 9-
2, or DBCR2 defined in Table 9-3) (in most cases; the Unconditional Debug Event
(UDE) is an exception to this rule). Once a Debug Status Register bit is set, if
Debug interrupts are enabled by MSRDE, a Debug interrupt will be generated.
Certain debug events are not allowed to occur when MSRDE=0. In such situations,
no Debug exception occurs and thus no Debug Status Register bit is set. Other
debug events may cause Debug exceptions and set Debug Status Register bits
regardless of the state of MSRDE. The associated Debug interrupts that result
from such Debug exceptions will be delayed until MSRDE=1, provided the excep-
tions have not been cleared from the Debug Status Register in the meantime.
Any time that a Debug Status Register bit is allowed to be set while MSRDE=0, a
special Debug Status Register bit, Imprecise Debug Event (DBSRIDE), will also be
set. DBSRIDE indicates that the associated Debug exception bit in the Debug Status
Register was set while Debug interrupts were disabled via the MSRDE bit.
Debug interrupt handler software can use this bit to determine whether the
address recorded in the Critical Save/Restore Register 0 should be interpreted as
the address associated with the instruction causing the Debug exception, or sim-
Debug interrupts are ordered with respect to other interrupt types (see Section
7.8 on page 174). Debug exceptions are prioritized with respect to other excep-
tions (see Section 7.9 on page 178).
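The interaction between the Debug Status Register, DBSRIDE, MSRDE, and DBCR0IDM can be modelled briefly. The Python sketch below is illustrative and covers only debug events that are allowed to be recorded while MSRDE=0; the class DebugState and its method names are hypothetical, and event bits are represented by name rather than by bit position.

```python
class DebugState:
    """Toy model of DBSR event bits, DBSRIDE, MSRDE and DBCR0IDM."""
    def __init__(self, msr_de=0, idm=1):
        self.msr_de, self.idm = msr_de, idm
        self.events = set()     # names of the DBSR event bits that are set
        self.ide = False        # stands in for DBSRIDE

    def record(self, event):
        """Record a debug event that may occur regardless of MSRDE."""
        self.events.add(event)
        if not self.msr_de:
            self.ide = True     # the exception was recorded while interrupts were off

    def interrupt_pending(self):
        return bool(self.events) and self.msr_de == 1 and self.idm == 1

    def clear(self, *events):
        self.events.difference_update(events)

dbg = DebugState(msr_de=0)
dbg.record("DAC1R")
delayed = dbg.interrupt_pending()   # MSRDE=0, so no interrupt yet
dbg.msr_de = 1
taken = dbg.interrupt_pending()     # delayed interrupt, with the IDE flag set
```

A handler that finds the IDE flag set knows the saved address does not identify the instruction that caused the exception. Had software cleared the event before setting MSRDE, no delayed interrupt would occur.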
One or more Instruction Address Compare debug events (IAC1, IAC2, IAC3 or
IAC4) occur if they are enabled and execution is attempted of an instruction at an
address that meets the criteria specified in the DBCR0, DBCR1, IAC1, IAC2, IAC3,
and IAC4 Registers.
DBCR1IAC2US specifies whether IAC2 debug events can occur in user mode or
supervisor mode, or both.
DBCR1IAC3US specifies whether IAC3 debug events can occur in user mode or
supervisor mode, or both.
DBCR1IAC4US specifies whether IAC4 debug events can occur in user mode or
supervisor mode, or both.
DBCR1IAC34M specifies whether all or some of the bits of the address of the
instruction fetch must match the contents of the IAC3 Register or IAC4
Register, whether the address must be inside a specific range specified by the
IAC3 Register and IAC4 Register or outside a specific range specified by the
IAC3 Register and IAC4 Register for an IAC3 or IAC4 debug event to occur.
For IAC3 and IAC4 debug events, if the address of the instruction fetch,
ANDed with the contents of the Instruction Address Compare 4 Register,
are equal to the contents of the Instruction Address Compare 3 Register,
also ANDed with the contents of the Instruction Address Compare 4
Register, an instruction address match occurs.
For IAC3 and IAC4 debug events, if the 64-bit address of the instruction
fetch is greater than or equal to the contents of the Instruction Address
Compare 3 Register and less than the contents of the Instruction Address
Compare 4 Register, an instruction address match occurs.
For IAC3 and IAC4 debug events, if the 64-bit address of the instruction
fetch is less than the contents of the Instruction Address Compare 3
Register or greater than or equal to the contents of the Instruction Address
Compare 4 Register, an instruction address match occurs.
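The three IAC3/IAC4 comparison rules above translate directly into predicates. In the Python sketch below the function name and the mode labels "mask", "inclusive", and "exclusive" are hypothetical; they are not the DBCR1IAC34M encodings, which are defined in Table 9-2.

```python
def iac34_match(addr: int, iac3: int, iac4: int, mode: str) -> bool:
    """Instruction address match for the IAC3/IAC4 mask and range modes."""
    if mode == "mask":           # IAC4 selects which address bits are compared
        return (addr & iac4) == (iac3 & iac4)
    if mode == "inclusive":      # IAC3 <= addr < IAC4
        return iac3 <= addr < iac4
    if mode == "exclusive":      # addr < IAC3 or addr >= IAC4
        return addr < iac3 or addr >= iac4
    raise ValueError(f"unknown mode: {mode}")

in_page = iac34_match(0x1234, 0x1200, 0xFF00, "mask")
at_upper_bound = iac34_match(0x2000, 0x1000, 0x2000, "inclusive")
```

The upper bound in IAC4 is not itself part of the inclusive range, so an address equal to IAC4 matches only in the exclusive mode.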
See Table 9-1 on page 210 and Table 9-2 on page 212 for a detailed description of
DBCR0 and DBCR1 and the modes for detecting IAC1, IAC2, IAC3 and IAC4
debug events.
If MSRDE=1 (i.e. Debug interrupts are enabled) at the time of the Instruction
Address Compare debug exception, a Debug interrupt will occur immediately (pro-
vided there exists no higher priority exception which is enabled to cause an inter-
rupt). The execution of the instruction causing the exception will be suppressed,
and Critical Save/Restore Register 0 will be set to the address of the excepting
instruction.
If MSRDE=0 (i.e. Debug interrupts are disabled) at the time of the Instruction
Address Compare debug exception, a Debug interrupt will not occur, and the
instruction will complete execution (provided the instruction is not causing some
other exception which will generate an enabled interrupt).
Later, if the debug exception has not been reset by clearing DBSRIAC1, DBSRIAC2,
DBSRIAC3, and DBSRIAC4, and MSRDE is set to 1, a delayed Debug interrupt will
occur. In this case, Critical Save/Restore Register 0 will contain the address of the
instruction after the one which enabled the Debug interrupt by setting MSRDE to
1. Software in the Debug interrupt handler can observe DBSRIDE to determine
how to interpret the value in Critical Save/Restore Register 0.
One or more Data Address Compare debug events (DAC1R, DAC1W, DAC2R,
DAC2W) occur if they are enabled, execution is attempted of a data storage access
instruction, and the type, address, and possibly even the data value of the data
storage access meet the criteria specified in the Debug Control Register 0, Debug
Control Register 2, and the DAC1, DAC2, DVC1, and DVC2 registers.
All Load instructions are considered reads with respect to debug events, while
all Store instructions are considered writes with respect to debug events. In
addition, the Cache Management instructions, and certain special cases, are
handled as follows.
– dcbt[e], dcbtst[e], icbt[e], and icbi[e] are all considered reads with respect
to debug events. Note that dcbt[e], dcbtst[e], and icbt[e] are treated as no-
operations when they report Data Storage or Data TLB Miss exceptions,
instead of being allowed to cause interrupts. However, these instructions
are allowed to cause Debug interrupts, even when they would otherwise
have been no-op’ed due to a Data Storage or Data TLB Miss exception.
– dcbz[e], dcbi[e], dcbf[e], dcba[e], and dcbst[e] are all considered writes
with respect to debug events.
Engineering Note
dcba[e], dcbt[e], dcbtst[e], and icbt[e] may cause a Data Address Compare debug
event even when they are otherwise being no-op’ed due to causing a Data Storage or
Data TLB Miss exception. However, signalling a Debug exception is not required in
these cases.
Indexed-string instructions (lswx, stswx) for which the XER field specifies
zero bytes as the length of the string are treated as no-ops, and are not
allowed to cause Data Address Compare debug events.
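The read and write classification used for Data Address Compare events can be captured in a lookup. The Python sketch below is a hypothetical helper, not an architected interface: the caller supplies storage_class ("load" or "store") for ordinary storage access instructions, and the function covers only the cases named in this section.

```python
READ_LIKE = {"dcbt", "dcbtst", "icbt", "icbi"}
WRITE_LIKE = {"dcbz", "dcbi", "dcbf", "dcba", "dcbst"}

def dac_access_kind(mnemonic, storage_class=None, string_length=None):
    """Return 'read', 'write', or None if no Data Address Compare event can occur."""
    name = mnemonic.lower()
    if name in ("lswx", "stswx") and string_length == 0:
        return None                      # zero-length string: treated as a no-op
    if name.endswith("e") and name[:-1] in READ_LIKE | WRITE_LIKE:
        name = name[:-1]                 # dcbte is handled like dcbt, and so on
    if name in READ_LIKE:
        return "read"
    if name in WRITE_LIKE:
        return "write"
    if storage_class == "load":
        return "read"                    # all Load instructions are reads
    if storage_class == "store":
        return "write"                   # all Store instructions are writes
    return None

kinds = [dac_access_kind("dcbtste"), dac_access_kind("dcbz"),
         dac_access_kind("lwz", "load"), dac_access_kind("stswx", "store", 0)]
```

The resulting kind would then be checked against the DAC1R, DAC1W, DAC2R, and DAC2W enables to decide which events are eligible.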
DBCR2DAC2US specifies whether DAC2R and DAC2W debug events can occur
in user mode or supervisor mode, or both.
DBCR2DVC2M and DBCR2DVC2BE specify whether and how the data value
being accessed by the storage access must match the contents of the Data
Value Compare 2 Register for a DAC2R or DAC2W debug event to occur.
See Table 9-1 on page 210 and Table 9-3 on page 215 for a detailed description of
DBCR0 and DBCR2 and the modes for detecting Data Address Compare debug
events. Data Address Compare debug events can occur regardless of the setting of
MSRDE or DBCR0IDM.
If MSRDE=1 (i.e. Debug interrupts are enabled) at the time of the Data Address
Compare debug exception, a Debug interrupt will occur immediately (provided
there exists no higher priority exception which is enabled to cause an interrupt),
the execution of the instruction causing the exception will be suppressed, and
Critical Save/Restore Register 0 will be set to the address of the excepting instruc-
tion. Depending on the type of instruction and/or the alignment of the data
access, the instruction causing the exception may have been partially executed
(see Section 7.7 on page 173).
If MSRDE=0 (i.e. Debug interrupts are disabled) at the time of the Data Address
Compare debug exception, a Debug interrupt will not occur, and the instruction
will complete execution (provided the instruction is not causing some other excep-
tion which will generate an enabled interrupt). Also, DBSRIDE is set to indicate
that the debug exception occurred while Debug interrupts were disabled by
MSRDE=0.
Later, if the debug exception has not been reset by clearing DBSRDAC1R,
DBSRDAC1W, DBSRDAC2R, DBSRDAC2W, and MSRDE is set to 1, a delayed Debug
interrupt will occur. In this case, Critical Save/Restore Register 0 will contain the
address of the instruction after the one which enabled the Debug interrupt by set-
ting MSRDE to 1. Software in the Debug interrupt handler can observe DBSRIDE to
determine how to interpret the value in Critical Save/Restore Register 0.
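As a minimal sketch of the check a Debug interrupt handler might make for a Data Address Compare event, the Python below tests the IDE bit of a DBSR image. The function name and return strings are assumptions for illustration; the bit position (bit 32, the most significant bit of the 32-bit register) is taken from the DBSR bit definitions in this chapter.

```python
# DBSR occupies bits 32:63, so bit n maps to 1 << (63 - n).
DBSR_IDE = 1 << (63 - 32)

def csrr0_meaning(dbsr):
    """Classify what CSRR0 holds after a Data Address Compare debug event."""
    if dbsr & DBSR_IDE:
        # Delayed interrupt: CSRR0 is the address of the instruction after
        # the one that set MSR[DE] to 1.
        return "imprecise"
    # Immediate interrupt: CSRR0 is the address of the excepting instruction.
    return "precise"
```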
A Trap debug event (TRAP) occurs if DBCR0TRAP=1 (i.e. Trap debug events are
enabled) and a Trap instruction (tw, twi, td, tdi) is executed and the conditions
specified by the instruction for the trap are met. The event can occur regardless of
the setting of MSRDE or DBCR0IDM.
When a Trap debug event occurs, DBSR TR is set to 1 to record the debug excep-
tion. If MSRDE=0, DBSRIDE is also set to 1 to record the imprecise debug event.
If MSRDE=0 (i.e. Debug interrupts are disabled) at the time of the Trap debug
exception, a Debug interrupt will not occur, and a Trap exception type Program
interrupt will occur instead if the trap condition is met.
Later, if the debug exception has not been reset by clearing DBSR TR, and MSRDE
is set to 1, a delayed Debug interrupt will occur. In this case, Critical Save/
Restore Register 0 will contain the address of the instruction after the one which
enabled the Debug interrupt by setting MSRDE to 1. Software in the debug inter-
rupt handler can observe DBSRIDE to determine how to interpret the value in Crit-
ical Save/Restore Register 0.
A Branch Taken debug event (BRT) occurs if DBCR0BRT=1 (i.e. Branch Taken
Debug events are enabled), execution is attempted of a branch instruction whose
direction will be taken (that is, either an unconditional branch, or a conditional
branch whose branch condition is met), and MSRDE=1.
Branch Taken debug events are not recognized if MSRDE=0 at the time of the exe-
cution of the branch instruction and thus DBSRIDE can not be set by a Branch
Taken debug event. This is because branch instructions occur very frequently.
Allowing these common events to be recorded as exceptions in the DBSR while
debug interrupts are disabled via MSRDE would result in an inordinate number of
imprecise Debug interrupts.
When a Branch Taken debug event occurs, the DBSRBRT bit is set to 1 to record
the debug exception and a Debug interrupt will occur immediately (provided there
exists no higher priority exception which is enabled to cause an interrupt). The
execution of the instruction causing the exception will be suppressed, and Critical
Save/Restore Register 0 will be set to the address of the excepting instruction.
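The difference between Branch Taken events and events such as Trap can be modelled in a few lines. This Python sketch assumes the event is already enabled in DBCR0 and that no higher priority exception is pending; the function name and the tuple layout are hypothetical.

```python
# Events that are only recognized while MSR[DE]=1. Branch Taken is described
# this way above; Instruction Complete is described the same way in this chapter.
REQUIRES_MSR_DE = {"BRT", "ICMP"}

def record_debug_event(event, msr_de):
    """Return (recorded_in_dbsr, ide_set, interrupt_now) for an enabled event."""
    if event in REQUIRES_MSR_DE and not msr_de:
        return (False, False, False)  # not recognized at all
    # Otherwise the event is recorded; IDE marks it imprecise when MSR[DE]=0.
    return (True, not msr_de, bool(msr_de))
```

A Trap event with MSR[DE]=0 gives `(True, True, False)`, whereas a Branch Taken event under the same condition leaves the DBSR untouched.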
Instruction Complete debug events are not recognized if MSRDE=0 at the time of
the execution of the instruction, and thus DBSRIDE cannot be set by an ICMP debug
event. This is because allowing the common event of Instruction Completion to be
recorded as an exception in the DBSR while Debug interrupts are disabled via
MSRDE would mean that the Debug interrupt handler software would receive an
inordinate number of imprecise Debug interrupts.
Only non-critical class interrupts can cause an Interrupt Taken debug event
because all critical interrupts automatically clear MSRDE, and thus would always
prevent the associated Debug interrupt from occurring precisely. Also, Debug
interrupts themselves are critical class interrupts, and thus any Debug interrupt
(for any other debug event) would always end up setting the additional exception
of DBSRIRPT upon entry to the Debug interrupt handler. At this point, the Debug
interrupt handler would be unable to determine whether or not the Interrupt
Taken debug event was related to the original debug event.
When an Interrupt Taken debug event occurs, DBSRIRPT is set to 1 to record the
debug exception. If MSRDE=0, DBSRIDE is also set to 1 to record the imprecise
debug event.
If MSRDE=1 (i.e. Debug interrupts are enabled) at the time of the Interrupt Taken
debug event, a Debug interrupt will occur immediately (provided there exists no
higher priority exception which is enabled to cause an interrupt), and Critical
Save/Restore Register 0 will be set to the address of the non-critical interrupt vec-
tor which caused the Interrupt Taken debug event. No instructions at the non-
critical interrupt handler will have been executed.
If MSRDE=0 (i.e. Debug interrupts are disabled) at the time of the Interrupt Taken
debug event, a Debug interrupt will not occur, and the handler for the interrupt
which caused the Interrupt Taken debug event will be allowed to execute.
Later, if the debug exception has not been reset by clearing DBSRIRPT, and MSRDE
is set to 1, a delayed Debug interrupt will occur. In this case, CSRR0 will contain
the address of the instruction after the one which enabled the Debug interrupt by
setting MSRDE to 1. Software in the Debug interrupt handler can observe the
DBSRIDE bit to determine how to interpret the value in Critical Save/Restore Reg-
ister 0.
When a Return debug event occurs, DBSRRET is set to 1 to record the debug
exception. If MSRDE=0, DBSRIDE is also set to 1 to record the imprecise debug
event.
If MSRDE=0 at the time of the Return Debug event, a Debug interrupt will not
occur.
Later, if the Debug exception has not been reset by clearing DBSRRET, and MSRDE
is set to 1, a delayed imprecise Debug interrupt will occur. In this case, Critical
Save/Restore Register 0 will contain the address of the instruction after the one
which enabled the Debug interrupt by setting MSRDE to 1. An imprecise Debug
interrupt can be caused by executing an rfi when DBCR0RET=1 and MSRDE=0,
and the execution of that rfi happens to cause MSRDE to be set to 1. Software in
the Debug interrupt handler can observe the DBSRIDE bit to determine how to
interpret the value in Critical Save/Restore Register 0.
An Unconditional debug event (UDE) occurs when the Unconditional Debug Event
(UDE) signal is activated by the debug mechanism. The exact definition of the
UDE signal and how it is activated is implementation-dependent (see the User’s
Manual for the implementation for more details). The Unconditional debug event
is the only debug event which does not have a corresponding enable bit for the
event in DBCR0 (hence the name of the event). The Unconditional debug event
can occur regardless of the setting of MSRDE.
When an Unconditional debug event occurs, the DBSRUDE bit is set to 1 to record
the Debug exception. If MSRDE=0, DBSRIDE is also set to 1 to record the imprecise
debug event.
If MSRDE=1 (i.e. Debug interrupts are enabled) at the time of the Unconditional
Debug exception, a Debug interrupt will occur immediately (provided there exists
no higher priority exception which is enabled to cause an interrupt), and Critical
Save/Restore Register 0 will be set to the address of the instruction which would
have executed next had the interrupt not occurred.
If MSRDE=0 (i.e. Debug interrupts are disabled) at the time of the Unconditional
Debug exception, a Debug interrupt will not occur.
Later, if the Unconditional Debug exception has not been reset by clearing
DBSRUDE, and MSRDE is set to 1, a delayed Debug interrupt will occur. In this
case, CSRR0 will contain the address of the instruction after the one which
enabled the Debug interrupt by setting MSRDE to 1. Software in the Debug inter-
rupt handler can observe DBSRIDE to determine how to interpret the value in Crit-
ical Save/Restore Register 0.
This section describes debug-related registers that are accessible to software run-
ning on the processor. These registers are intended for use by special debug tools
and debug software, and not by general application or operating system code.
Bit(s) Description
32 Allocated for implementation-dependent use. See the User’s Manual for the imple-
mentation for details.
33 Internal Debug Mode (IDM)
=0 Debug interrupts are disabled.
=1 If MSRDE=1, then the occurrence of a debug event or the recording of an
earlier debug event in the Debug Status Register when MSRDE=0 or
DBCR0IDM=0 will cause a Debug interrupt.
Programming Note
Software must clear debug event status in the Debug Status Register in the Debug
interrupt handler when a Debug interrupt is taken before re-enabling interrupts
via MSRDE. Otherwise, redundant Debug interrupts will be taken for the same de-
bug event.
34:35 Reset (RST)
=00 No action
=01 See the User’s Manual for the implementation for details.
=10 See the User’s Manual for the implementation for details.
=11 See the User’s Manual for the implementation for details.
Warning
Writing 0b01, 0b10, or 0b11 to these bits may cause a processor reset to occur.
36 Instruction Completion Debug Event (ICMP)
=0 ICMP debug events are disabled
=1 ICMP debug events are enabled
Note
Instruction Completion will not cause an ICMP debug event if MSRDE=0.
Bit(s) Description
32:33 Instruction Address Compare 1 User/Supervisor Mode (IAC1US)
=00 IAC1 debug events can occur
=01 Reserved
=10 IAC1 debug events can occur only if MSRPR=0
=11 IAC1 debug events can occur only if MSRPR=1
34:35 Instruction Address Compare 1 Effective/Real Mode (IAC1ER)
=00 IAC1 debug events are based on effective addresses
=01 IAC1 debug events are based on real addresses
=10 IAC1 debug events are based on effective addresses and can occur only if
MSRIS=0
=11 IAC1 debug events are based on effective addresses and can occur only if
MSRIS=1
36:37 Instruction Address Compare 2 User/Supervisor Mode (IAC2US)
=00 IAC2 debug events can occur
=01 Reserved
=10 IAC2 debug events can occur only if MSRPR=0
=11 IAC2 debug events can occur only if MSRPR=1
38:39 Instruction Address Compare 2 Effective/Real Mode (IAC2ER)
=00 IAC2 debug events are based on effective addresses
=01 IAC2 debug events are based on real addresses
=10 IAC2 debug events are based on effective addresses and can occur only if
MSRIS=0
=11 IAC2 debug events are based on effective addresses and can occur only if
MSRIS=1
40:41 Instruction Address Compare 1/2 Mode (IAC12M)
=00 Exact address compare
• IAC1 debug events can occur only if the address of the instruction fetch
is equal to the value specified in IAC1.
• IAC2 debug events can occur only if the address of the instruction fetch
is equal to the value specified in IAC2.
=01 Address bit match
• IAC1 and IAC2 debug events can occur only if the address of the instruc-
tion fetch, ANDed with the contents of IAC2, is equal to the contents of
IAC1, also ANDed with the contents of IAC2.
=10 Inclusive address range compare
• IAC1 and IAC2 debug events can occur only if the address of the instruc-
tion fetch is greater than or equal to the value specified in IAC1 and less
than the value specified in IAC2.
=11 Exclusive address range compare
• IAC1 and IAC2 debug events can occur only if the address of the instruc-
tion fetch is less than the value specified in IAC1 or is greater than or
equal to the value specified in IAC2.
=00 Exact address compare
• IAC3 debug events can occur only if the address of the instruction fetch
is equal to the value specified in IAC3.
• IAC4 debug events can occur only if the address of the instruction fetch
is equal to the value specified in IAC4.
=01 Address bit match
• IAC3 and IAC4 debug events can occur only if the address of the instruc-
tion fetch, ANDed with the contents of IAC4, is equal to the contents of
IAC3, also ANDed with the contents of IAC4.
=10 Inclusive address range compare
• IAC3 and IAC4 debug events can occur only if the address of the instruc-
tion fetch is greater than or equal to the value specified in IAC3 and less
than the value specified in IAC4.
=11 Exclusive address range compare
• IAC3 and IAC4 debug events can occur only if the address of the instruc-
tion fetch is less than the value specified in IAC3 or is greater than or
equal to the value specified in IAC4.
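The address compare conditions listed above apply identically to the IAC1/IAC2 pair and the IAC3/IAC4 pair. The Python sketch below evaluates them for one pair. It is a hedged illustration: the function name is hypothetical, and the encodings 0b10 and 0b11 for the two range modes are assumed to follow the order in which the conditions are listed.

```python
def address_compare(mode, addr, c1, c2):
    """Return (first_hit, second_hit) for a compare-register pair (c1, c2)."""
    if mode == 0b00:
        # Exact compare: each register is checked on its own.
        return (addr == c1, addr == c2)
    if mode == 0b01:
        # Bit match: c2 acts as a mask applied to both the address and c1.
        hit = (addr & c2) == (c1 & c2)
    elif mode == 0b10:
        # Inside the range: c1 <= addr < c2.
        hit = c1 <= addr < c2
    else:
        # Outside the range: addr < c1 or addr >= c2.
        hit = addr < c1 or addr >= c2
    return (hit, hit)
```

Note that the upper bound is excluded from the inside-range test and therefore included in the outside-range test.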
Bit(s) Description
32:33 Data Address Compare 1 User/Supervisor Mode (DAC1US)
=00 DAC1 debug events can occur
=01 Reserved
=10 DAC1 debug events can occur only if MSRPR=0
=11 DAC1 debug events can occur only if MSRPR=1
34:35 Data Address Compare 1 Effective/Real Mode (DAC1ER)
=00 DAC1 debug events are based on effective addresses
=01 DAC1 debug events are based on real addresses
=10 DAC1 debug events are based on effective addresses and can occur only if
MSRDS=0
=11 DAC1 debug events are based on effective addresses and can occur only if
MSRDS=1
36:37 Data Address Compare 2 User/Supervisor Mode (DAC2US)
=00 DAC2 debug events can occur
=01 Reserved
=10 DAC2 debug events can occur only if MSRPR=0
=11 DAC2 debug events can occur only if MSRPR=1
38:39 Data Address Compare 2 Effective/Real Mode (DAC2ER)
=00 DAC2 debug events are based on effective addresses
=01 DAC2 debug events are based on real addresses
=10 DAC2 debug events are based on effective addresses and can occur only if
MSRDS=0
=11 DAC2 debug events are based on effective addresses and can occur only if
MSRDS=1
40:41 Data Address Compare 1/2 Mode (DAC12M)
=00 Exact address compare
• DAC1 debug events can occur only if the address of the data storage ac-
cess is equal to the value specified in DAC1.
• DAC2 debug events can occur only if the address of the data storage ac-
cess is equal to the value specified in DAC2.
=01 Address bit match
• DAC1 and DAC2 debug events can occur only if the address of the data
storage access, ANDed with the contents of DAC2, is equal to the con-
tents of DAC1, also ANDed with the contents of DAC2.
=10 Inclusive address range compare
• DAC1 and DAC2 debug events can occur only if the address of the data
storage access is greater than or equal to the value specified in DAC1 and
less than the value specified in DAC2.
=11 Exclusive address range compare
• DAC1 and DAC2 debug events can occur only if the address of the data
storage access is less than the value specified in DAC1 or is greater than
or equal to the value specified in DAC2.
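The User/Supervisor and Effective/Real fields of DBCR2 described above act as gates on the address compare. A minimal Python sketch of that gating follows; the function names and the returned tuple are assumptions for illustration, and the reserved encoding 0b01 of the US field is rejected rather than given any behavior.

```python
def us_permits(us, msr_pr):
    """DACnUS: may the event occur in the current privilege state?"""
    if us == 0b00:
        return True            # either state
    if us == 0b10:
        return msr_pr == 0     # supervisor only
    if us == 0b11:
        return msr_pr == 1     # user only
    raise ValueError("reserved DACnUS encoding")

def er_selects(er, msr_ds):
    """DACnER: return (address_kind, permitted) for the current MSR[DS]."""
    if er == 0b00:
        return ("effective", True)
    if er == 0b01:
        return ("real", True)
    if er == 0b10:
        return ("effective", msr_ds == 0)
    return ("effective", msr_ds == 1)
```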
The Debug Status Register (DBSR) is a 32-bit register and contains status on
debug events and the most recent processor reset. Table 9-4 provides bit defini-
tions for the Debug Status Register.
The Debug Status Register is set via hardware, and read and cleared via software.
The contents of the Debug Status Register can be read into bits 32:63 of a General
Purpose Register using mfspr RT,DBSR, setting bits 0:31 of GPR(RT) to zero. Bits
in the Debug Status Register can be cleared using mtspr DBSR,RS. Clearing is
done by writing bits 32:63 of a General Purpose Register to the Debug Status
Register with a 1 in any bit position that is to be cleared and 0 in all other bit
positions. The write-data to the Debug Status Register is not direct data, but a
mask. A 1 causes the bit to be cleared, and a 0 has no effect.
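The write-one-to-clear behavior of mtspr DBSR,RS can be illustrated with a short model. This Python sketch is not an implementation of any processor; the function name is hypothetical and the DBSR is represented as a plain 32-bit integer.

```python
def dbsr_after_mtspr(dbsr, rs):
    """Model mtspr DBSR,RS: a 1 in RS[32:63] clears the matching DBSR bit."""
    mask = rs & 0xFFFFFFFF          # only bits 32:63 of the GPR are written
    return dbsr & ~mask & 0xFFFFFFFF
```

Writing back the value just read from the DBSR therefore clears exactly the events that were observed, and writing zero changes nothing.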
Bit(s) Description
32 Imprecise Debug Event (IDE)
Set to 1 if MSRDE=0 and a debug event causes its respective Debug Status Register
bit to be set to 1.
33 Unconditional Debug Event (UDE)
Set to 1 if an Unconditional debug event occurred. See Section 9.3.8 on page 209.
34:35 Most Recent Reset (MRR)
Set to one of three values when a reset occurs. These two bits are undefined at
power-up.
=00 No reset occurred since these bits were last cleared by software
=01 Implementation-dependent reset information
=10 Implementation-dependent reset information
=11 Implementation-dependent reset information
Note
See the User’s Manual for the implementation for further details.
36 Instruction Complete Debug Event (ICMP)
Set to 1 if an Instruction Completion debug event occurred and DBCR0ICMP=1.
See Section 9.3.5 on page 207.
37 Branch Taken Debug Event (BRT)
Set to 1 if a Branch Taken debug event occurred and DBCR0BRT=1.
See Section 9.3.4 on page 207.
38 Interrupt Taken Debug Event (IRPT)
Set to 1 if an Interrupt Taken debug event occurred and DBCR0IRPT=1.
See Section 9.3.6 on page 208.
39 Trap Instruction Debug Event (TRAP)
Set to 1 if a Trap Instruction debug event occurred and DBCR0TRAP=1.
See Section 9.3.3 on page 206.
40 Instruction Address Compare 1 Debug Event (IAC1)
Set to 1 if an IAC1 debug event occurred and DBCR0IAC1=1. See Section 9.3.1 on
page 202.
41 Instruction Address Compare 2 Debug Event (IAC2)
Set to 1 if an IAC2 debug event occurred and DBCR0IAC2=1. See Section 9.3.1 on
page 202.
42 Instruction Address Compare 3 Debug Event (IAC3)
Set to 1 if an IAC3 debug event occurred and DBCR0IAC3=1. See Section 9.3.1 on
page 202.
43 Instruction Address Compare 4 Debug Event (IAC4)
Set to 1 if an IAC4 debug event occurred and DBCR0IAC4=1. See Section 9.3.1 on
page 202.
44 Data Address Compare 1 Read Debug Event (DAC1R)
Set to 1 if a read-type DAC1 debug event occurred and DBCR0DAC1=0b10 or
DBCR0DAC1=0b11. See Section 9.3.2 on page 204.
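As an illustration of the bit numbering used in the table above, the following Python sketch decodes a 32-bit DBSR image into event names. Only the bits listed above are included; the function name and the return layout are assumptions for this example.

```python
# Bit numbers follow the table above (the register occupies bits 32:63).
DBSR_BITS = {
    32: "IDE", 33: "UDE", 36: "ICMP", 37: "BRT", 38: "IRPT", 39: "TRAP",
    40: "IAC1", 41: "IAC2", 42: "IAC3", 43: "IAC4", 44: "DAC1R",
}

def decode_dbsr(value):
    """Return (list of set event names, MRR field) for a DBSR image."""
    names = [name for bit, name in sorted(DBSR_BITS.items())
             if value & (1 << (63 - bit))]
    mrr = (value >> (63 - 35)) & 0b11   # bits 34:35, Most Recent Reset
    return names, mrr
```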
The contents of the Instruction Address Compare i Register (where i={1,2,3, or 4})
can be read into a General Purpose Register using mfspr RT,IACi. The contents of
a General Purpose Register can be written to the Instruction Address Compare i
Register using mtspr IACi,RS.
The Data Address Compare 1 Register (DAC1) and Data Address Compare 2 Reg-
ister (DAC2) are each 64 bits.
A debug event may be enabled to occur upon loads, stores, or cache operations to
an address specified in either the Data Address Compare 1 Register or Data
Address Compare 2 Register, inside or outside a range specified by the Data
Address Compare 1 Register and Data Address Compare 2 Register, or to blocks
of addresses specified by the combination of the Data Address Compare 1 Register
and Data Address Compare 2 Register. See Section 9.3.2 on page 204.
The contents of the Data Address Compare 1 Register or Data Address Compare 2
Register are compared to the address generated by a data storage access instruc-
tion.
The Data Value Compare 1 Register (DVC1) and Data Value Compare 2 Register
(DVC2) are each 64 bits.
A DAC1R, DAC1W, DAC2R, or DAC2W debug event may be enabled to occur upon
loads or stores of a specific data value specified in either or both of the Data Value
Compare 1 Register and Data Value Compare 2 Register. DBCR2DVC1M and
DBCR2DVC1BE control how the contents of the Data Value Compare 1 Register are
compared with the value and DBCR2DVC2M and DBCR2DVC2BE control how the
contents of the Data Value Compare 2 Register are compared with the value. See
Section 9.3.2 on page 204 and Table 9-3 on page 215 for a detailed description of
the modes provided.
The contents of the Data Value Compare i Register (where i={1 or 2}) can be read
into a General Purpose Register using mfspr RT,DVCi. The contents of a General
Purpose Register can be written to the Data Value Compare i Register using mtspr
DVCi,RS.
This chapter describes the requirements for Book E processor reset. This includes
both the means of causing reset, and the specific initialization that is required to
be performed automatically by the processor hardware. This chapter also provides
an overview of the operations that should be performed by initialization software,
in order to fully initialize the processor.
In general, the specific actions taken by a processor upon reset are implementa-
tion dependent, and are described in the User’s Manual for the implementation.
Also, it is the responsibility of system initialization software to initialize the major-
ity of processor and system resources after reset. Implementations are required to
provide a minimum processor initialization such that this system software may be
fetched and executed, thereby accomplishing the rest of system initialization.
The initial processor state is controlled by the register contents after reset. In gen-
eral, the contents of most registers are undefined after reset.
The processor hardware is only guaranteed to initialize those registers (or specific
bits in registers) which must be initialized in order for software to be able to reli-
ably perform the rest of system initialization.
TLB entry
A TLB entry (which entry is implementation-dependent) is initialized in an
implementation-dependent manner that maps the last 4KB page in the
implemented storage address space, with the following field settings:
The notation “CSI” in the tables means any context synchronizing instruction (i.e.,
sc, isync, rfci or rfi). A context synchronizing interrupt (that is, any interrupt
except non-recoverable Machine Check) can be used instead of a context synchro-
nizing instruction. If it is, phrases like “the synchronizing instruction,” below,
should be interpreted as meaning the instruction at which the interrupt occurs. If
no software synchronization is required before (after) a context-altering instruc-
tion, “the synchronizing instruction before (after) the context-altering instruction”
should be interpreted as meaning the context-altering instruction itself.
Programming Note
Sometimes advantage can be taken of the fact that certain instructions that occur natu-
rally in the program, such as the rfi/rfci at the end of an interrupt handler, provide the
required synchronization.
Table 11-1 below identifies the software synchronization requirements for data
access for all context-altering instructions.
Context Altering Instruction or Event    Required Before    Required After    Notes
interrupt none none
rfi none none
rfci none none
sc none none
mtmsr (PR) none CSI
mtmsr (ME) none CSI 1
mtmsr (DS) none CSI
mtspr (DAC1, DAC2, DVC1, DVC2) — — 4
mtspr (DBCR0, DBCR2) — — 4
mtspr (DBSR) — — 4
mtspr (PID) CSI CSI
tlbiva[e] CSI CSI or msync 5,6
tlbwe CSI CSI or msync 5,6
Table 11-2 below identifies the software synchronization requirements for instruc-
tion fetch and/or execution for all context-altering instructions.
Context Altering Instruction or Event    Required Before    Required After    Notes
interrupt none none
rfi none none
rfci none none
sc none none
mtmsr (WE) — — 2
mtmsr (CE) none none 3
mtmsr (EE) none none 3
wrtee none none 3
wrteei none none 3
mtmsr (PR) none CSI
mtmsr (FP) none CSI
mtmsr (ME) none CSI 1
mtmsr (FE0) none CSI
mtmsr (FE1) none CSI
mtmsr (DE) none CSI
mtspr (IVPR) none none
mtspr (IVORi) none none
mtmsr (IS) none CSI 7
mtspr (PID) none CSI 7
mtspr (DEC) none none 8
mtspr (TCR) none none 8
mtspr (TSR) none none 8
mtspr (IAC1, IAC2, IAC3, IAC4) — — 4
mtspr (DBCR0, DBCR1) — — 4
mtspr (DBSR) — — 4
tlbiva[e] none CSI or msync 5,6
tlbwe none CSI or msync 5,6
2. Synchronization requirements for changing the Wait State Enable are imple-
mentation-dependent, and are specified in the User’s Manual for the
implementation.
If an mtmsr, wrtee, or wrteei instruction changes MSREE from ‘0’ to ‘1’ when
an External Input, Decrementer, Fixed-Interval Timer, or higher priority
enabled exception exists, the corresponding interrupt occurs immediately
after the mtmsr, wrtee, or wrteei is executed, and before the next instruc-
tion is executed in the program that set MSREE to ‘1’.
5. For data accesses, the context synchronizing instruction before the tlbwe or
tlbiva[e] instruction ensures that all storage accesses due to preceding
instructions have completed to a point at which they have reported all excep-
tions they will cause.
Programming Note
The following sequence illustrates why it is necessary, for data accesses, to ensure
that all storage accesses due to instructions before the tlbwe or tlbiva[e] have com-
pleted to a point at which they have reported all exceptions they will cause. Assume
that valid TLB entries exist for the target storage location when the sequence starts.
3. The Load or Store instruction finally executes, and gets a TLB Miss exception.
The TLB Miss exception is semantically incorrect. In order to prevent it, a context
synchronizing instruction must be executed between steps 1 and 2.
7. The alteration must not cause an implicit branch in real address space. Thus
the real address of the context-altering instruction and of each subsequent
instruction, up to and including the next context synchronizing instruction,
must be independent of whether the alteration has taken effect.
8. The elapsed time between the Decrementer reaching zero, or the transition of
the selected Time Base bit for the Fixed-Interval Timer or the Watchdog Timer,
and the signalling of the Decrementer, Fixed-Interval Timer or the Watchdog
Timer exception is not defined.
Add
The sum of the contents of GPR(RA) and the contents of GPR(RB) is placed into
GPR(RT).
The sum of the contents of GPR(RA) and the contents of GPR(RB) is placed into
GPR(RT).
For adde[o][.], the sum of the contents of GPR(RA), the contents of GPR(RB), and
CA is placed into GPR(RT).
For adde64[o], the sum of the contents of GPR(RA), the contents of GPR(RB), and
CA64 is placed into GPR(RT).
If addi and RA=0, the sign-extended value of the SI field is placed into GPR(RT).
If addi and RA≠0, the sum of the contents of GPR(RA) and the sign-extended
value of field SI is placed into GPR(RT).
If addis and RA=0, the sign-extended value of the SI field, concatenated with 16
zeros, is placed into GPR(RT).
If addis and RA≠0, the sum of the contents of GPR(RA) and the sign-extended
value of the SI field concatenated with 16 zeros, is placed into GPR(RT).
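The special handling of RA=0 in addi and addis can be seen in a small register-file model. The sketch below is illustrative Python: the helper names and the list-based register file are assumptions, and SI is taken to be a 16-bit field as in the standard D-form encoding.

```python
MASK64 = (1 << 64) - 1

def exts(value, bits):
    """Sign-extend a bits-wide field to a Python integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def addi(gpr, rt, ra, si):
    base = 0 if ra == 0 else gpr[ra]        # RA=0 means the literal 0, not GPR(0)
    gpr[rt] = (base + exts(si, 16)) & MASK64

def addis(gpr, rt, ra, si):
    base = 0 if ra == 0 else gpr[ra]
    # SI concatenated with 16 zeros, then sign-extended from 32 bits.
    gpr[rt] = (base + exts((si & 0xFFFF) << 16, 32)) & MASK64
```

With GPR(0) holding any nonzero value, `addi(gpr, 3, 0, 0xFFFF)` still places all ones (that is, -1) into GPR(3), because the register is not read when RA=0.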
The sum of the contents of GPR(RA) and the sign-extended value of the SI field is
placed into GPR(RT).
For addme[o][.], the sum of the contents of GPR(RA), CA, and ⁶⁴1 (64 one bits) is
placed into GPR(RT).
For addme64[o], the sum of the contents of GPR(RA), CA64, and ⁶⁴1 (64 one bits) is
placed into GPR(RT).
For addze[o][.], the sum of the contents of GPR(RA) and CA is placed into
GPR(RT).
For addze64[o], the sum of the contents of GPR(RA) and CA64 is placed into
GPR(RT).
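A worked sketch of the two carry-consuming forms follows, taking the constant operand of addme to be 64 one bits (that is, -1 in two's complement). The Python function names are hypothetical, and only the value placed into GPR(RT) is modelled, not the carry or overflow outputs.

```python
MASK64 = (1 << 64) - 1   # 64 one bits

def addme(ra_value, carry):
    """GPR(RA) + carry + (64 one bits), truncated to 64 bits."""
    return (ra_value + carry + MASK64) & MASK64

def addze(ra_value, carry):
    """GPR(RA) + carry, truncated to 64 bits."""
    return (ra_value + carry) & MASK64
```

Here `carry` stands for CA or CA64, depending on which form of the instruction is being modelled.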
For andi., the contents of GPR(RS) are ANDed with ⁴⁸0 || UI (48 zero bits concatenated with UI).
For andis., the contents of GPR(RS) are ANDed with ³²0 || UI || ¹⁶0 (32 zero bits, UI, then 16 zero bits).
For and[.], the contents of GPR(RS) are ANDed with the contents of GPR(RB).
For andc[.], the contents of GPR(RS) are ANDed with the one’s complement of the
contents of GPR(RB).
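The four AND forms differ only in how the second operand is built. The following Python sketch shows each mask explicitly; the function names are assumptions, UI is treated as a 16-bit unsigned field, and Condition Register updates are not modelled.

```python
MASK64 = (1 << 64) - 1

def andi(rs, ui):
    return rs & (ui & 0xFFFF)               # 48 zero bits || UI

def andis(rs, ui):
    return rs & ((ui & 0xFFFF) << 16)       # 32 zero bits || UI || 16 zero bits

def and_(rs, rb):
    return rs & rb & MASK64

def andc(rs, rb):
    return rs & ~rb & MASK64                # AND with the one's complement of RB
```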
• For b[l][a], let BTEA be 32 0s concatenated with bits 32:63 of the sum of the
current instruction address (CIA), or 64 0s if AA=1, and the sign-extended
value of the LI instruction field concatenated with 0b00.
• For be[l][a], let BTEA be the sum of the current instruction address (CIA), or
64 0s if AA=1, and the sign-extended value of the LI instruction field concate-
nated with 0b00.
• For bc[l][a], let BTEA be 32 0s concatenated with bits 32:63 of the sum of the
current instruction address (CIA), or 64 0s if AA=1, and the sign-extended
value of the BD instruction field concatenated with 0b00.
• For bce[l][a], let BTEA be the sum of the current instruction address (CIA), or
64 0s if AA=1, and the sign-extended value of the BD instruction field concat-
enated with 0b00.
The BO field of the instruction specifies the condition or conditions that must be
met in order for the branch to be taken, as defined in Section 3.3 on page 49. The
sum BI+32 specifies the bit of the Condition Register that is to be used.
If the branch conditions are met, the BTEA is the address of the next instruction
to be executed.
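The branch target calculations above differ only in the displacement field used and in whether the result is truncated to 32 bits. The Python sketch below is a hedged illustration: the function names are hypothetical, and the field widths (24 bits for LI, 14 bits for BD) are taken from the standard PowerPC instruction encodings rather than from this section.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def exts(value, bits):
    """Sign-extend a bits-wide field to a Python integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def btea(cia, field, field_bits, aa, extended):
    """Branch target from a displacement field (LI or BD) concatenated with 0b00."""
    base = 0 if aa else cia                        # AA=1 replaces CIA with zeros
    target = (base + exts(field << 2, field_bits + 2)) & MASK64
    # Non-extended forms (b, bc) keep only bits 32:63 of the sum.
    return target if extended else target & MASK32
```

For instance, a bc at address 0x1_0000_1000 with BD=1 targets 0x1004, while bce with the same operands targets 0x1_0000_1004.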
• For bcctr[l], let BTEA be 32 0s concatenated with the contents of bits 32:61 of
the Count Register concatenated with 0b00.
• For bcctre[l], let BTEA be the contents of bits 0:61 of the Count Register con-
catenated with 0b00.
The BO field of the instruction specifies the condition or conditions that must be
met in order for the branch to be taken, as defined in Section 3.3 on page 49. The
sum BI+32 specifies the bit of the Condition Register that is to be used.
If the branch condition is met, the BTEA is the address of the next instruction to
be executed.
If the ‘decrement and test CTR’ option is specified (BO2=0), the instruction form is
invalid.
• For bclr[l], let BTEA be 32 0s concatenated with the contents of bits 32:61 of
the Link Register concatenated with 0b00.
• For bclre[l], let BTEA be the contents of bits 0:61 of the Link Register concat-
enated with 0b00.
The BO field of the instruction specifies the condition or conditions that must be
met in order for the branch to be taken, as defined in Section 3.3 on page 49. The
sum BI+32 specifies the bit of the Condition Register that is to be used.
If the branch condition is met, the BTEA is the address of the next instruction to
be executed.
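For the register-based branches (bcctr, bcctre, bclr, bclre), the target is formed by forcing the two low-order bits of the Count Register or Link Register to zero. A minimal Python sketch, with a hypothetical function name, is:

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def btea_from_register(reg, extended):
    """Target from CTR or LR: bits 0:61 (or 32:61) concatenated with 0b00."""
    target = reg & ~0b11 & MASK64
    # Non-extended forms prepend 32 zeros instead of using bits 0:31.
    return target if extended else target & MASK32
```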
cmp BF,L,RA,RB
Bits    0:5     6:8  9  10  11:15  16:20  21:30       31
Field   011111  BF   /  L   RA     RB     0000000000  /
cmpi BF,L,RA,SI
Bits    0:5     6:8  9  10  11:15  16:31
Field   001011  BF   /  L   RA     SI
If cmp and L=0, the contents of bits 32:63 of GPR(RA) are compared with the con-
tents of bits 32:63 of GPR(RB), treating the operands as signed integers.
If cmp and L=1, the contents of GPR(RA) are compared with the contents of
GPR(RB), treating the operands as signed integers.
If cmpi and L=0, the contents of bits 32:63 of GPR(RA) are compared with the
sign-extended value of the SI field, treating the operands as signed integers.
If cmpi and L=1, the contents of GPR(RA) are compared with the sign-extended
value of the SI field, treating the operands as signed integers.
cmpl BF,L,RA,RB
Bits    0:5     6:8  9  10  11:15  16:20  21:30       31
Field   011111  BF   /  L   RA     RB     0000100000  /
cmpli BF,L,RA,UI
Bits    0:5     6:8  9  10  11:15  16:31
Field   001010  BF   /  L   RA     UI
If cmpl and L=0, the contents of bits 32:63 of GPR(RA) are compared with the con-
tents of bits 32:63 of GPR(RB), treating the operands as unsigned integers.
If cmpl and L=1, the contents of GPR(RA) are compared with the contents of
GPR(RB), treating the operands as unsigned integers.
If cmpli and L=0, the contents of bits 32:63 of GPR(RA) are compared with the
zero-extended value of the UI field, treating the operands as unsigned integers.
If cmpli and L=1, the contents of GPR(RA) are compared with the zero-extended
value of the UI field, treating the operands as unsigned integers.
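The signed comparisons (cmp, cmpi) and unsigned comparisons (cmpl, cmpli) described above share one structure: L selects the operand width, and the instruction selects the interpretation. The Python sketch below assumes the immediate has already been sign-extended or zero-extended by the caller; the function names and the string results are illustrative and do not represent the Condition Register field encoding.

```python
def exts(value, bits):
    """Sign-extend a bits-wide field to a Python integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def compare(a, b, l, signed):
    """Compare two operands as cmp/cmpi (signed) or cmpl/cmpli (unsigned)."""
    bits = 64 if l else 32                 # L=0 uses only bits 32:63
    mask = (1 << bits) - 1
    a, b = a & mask, b & mask
    if signed:
        a, b = exts(a, bits), exts(b, bits)
    return "lt" if a < b else "gt" if a > b else "eq"
```

The value 0xFFFFFFFF compared with 1 is "lt" for a signed 32-bit compare but "gt" for a signed 64-bit compare, since the upper 32 bits are then significant.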
For cntlzw[.], a count of the number of consecutive zero bits starting at bit 32 of
the contents of GPR(RS) is placed into GPR(RA). This number ranges from 0 to 32,
inclusive. If Rc=1, CR Field 0 is set to reflect the result.
For cntlzd, a count of the number of consecutive zero bits starting at bit 0 of the
contents of GPR(RS) is placed into GPR(RA). This number ranges from 0 to 64,
inclusive. If Rc=1, the instruction form is invalid.
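Both count-leading-zeros forms can be expressed through the bit length of the operand. This Python sketch uses hypothetical function names and models only the value placed into GPR(RA).

```python
def cntlzw(rs):
    """Leading zeros counted from bit 32, i.e. within the low-order 32 bits."""
    return 32 - (rs & 0xFFFFFFFF).bit_length()

def cntlzd(rs):
    """Leading zeros counted from bit 0 of the 64-bit operand."""
    return 64 - (rs & 0xFFFFFFFFFFFFFFFF).bit_length()
```

A zero operand gives the maximum count (32 or 64), matching the inclusive ranges stated above.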
The content of bit BA+32 of the Condition Register is ANDed with the content of
bit BB+32 of the Condition Register, and the result is placed into bit BT+32 of the
Condition Register.
crandc BT,BA,BB
Bits    0:5     6:10  11:15  16:20  21:30       31
Field   010011  BT    BA     BB     0010000001  /
The content of bit BA+32 of the Condition Register is ANDed with the one’s com-
plement of the content of bit BB+32 of the Condition Register, and the result is
placed into bit BT+32 of the Condition Register.
creqv BT,BA,BB
Bits    0:5     6:10  11:15  16:20  21:30       31
Field   010011  BT    BA     BB     0100100001  /
The content of bit BA+32 of the Condition Register is XORed with the content of
bit BB+32 of the Condition Register, and the one’s complement of result is placed
into bit BT+32 of the Condition Register.
crnand BT,BA,BB
Bits    0:5     6:10  11:15  16:20  21:30       31
Field   010011  BT    BA     BB     0011100001  /
The content of bit BA+32 of the Condition Register is ANDed with the content of
bit BB+32 of the Condition Register, and the one’s complement of the result is
placed into bit BT+32 of the Condition Register.
crnor BT,BA,BB
Bits    0:5     6:10  11:15  16:20  21:30       31
Field   010011  BT    BA     BB     0000100001  /
The content of bit BA+32 of the Condition Register is ORed with the content of bit
BB+32 of the Condition Register, and the one’s complement of the result is placed
into bit BT+32 of the Condition Register.
Condition Register OR
cror BT,BA,BB
0 1 0 0 1 1 BT BA BB 0 1 1 1 0 0 0 0 0 1 /
0 6 11 16 21 31
The content of bit BA+32 of the Condition Register is ORed with the content of bit
BB+32 of the Condition Register, and the result is placed into bit BT+32 of the
Condition Register.
crorc BT,BA,BB
0 1 0 0 1 1 BT BA BB 0 1 1 0 1 0 0 0 0 1 /
0 6 11 16 21 31
The content of bit BA+32 of the Condition Register is ORed with the one’s comple-
ment of the content of bit BB+32 of the Condition Register, and the result is
placed into bit BT+32 of the Condition Register.
crxor BT,BA,BB
0 1 0 0 1 1 BT BA BB 0 0 1 1 0 0 0 0 0 1 /
0 6 11 16 21 31
The content of bit BA+32 of the Condition Register is XORed with the content of
bit BB+32 of the Condition Register, and the result is placed into bit BT+32 of the
Condition Register.
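The eight Condition Register logical operations described above differ only in the Boolean function applied to the two source bits. The following C sketch models them on a 32-bit value whose most significant bit stands for Condition Register bit 32. The type cr_op and the functions cr_get and cr_logical are hypothetical names introduced for illustration.

```c
#include <stdint.h>

typedef enum {
    CR_AND, CR_ANDC, CR_EQV, CR_NAND, CR_NOR, CR_OR, CR_ORC, CR_XOR
} cr_op;

/* Return CR bit b+32, where b is a BT/BA/BB field value in 0..31. */
static unsigned cr_get(uint32_t cr, unsigned b)
{
    return (cr >> (31 - b)) & 1u;
}

/* Apply one CR logical operation and return the updated CR image. */
static uint32_t cr_logical(uint32_t cr, cr_op op,
                           unsigned bt, unsigned ba, unsigned bb)
{
    unsigned a = cr_get(cr, ba);
    unsigned b = cr_get(cr, bb);
    unsigned r;

    switch (op) {
    case CR_AND:  r = a & b;          break;  /* crand  */
    case CR_ANDC: r = a & (b ^ 1u);   break;  /* crandc */
    case CR_EQV:  r = (a ^ b) ^ 1u;   break;  /* creqv  */
    case CR_NAND: r = (a & b) ^ 1u;   break;  /* crnand */
    case CR_NOR:  r = (a | b) ^ 1u;   break;  /* crnor  */
    case CR_OR:   r = a | b;          break;  /* cror   */
    case CR_ORC:  r = a | (b ^ 1u);   break;  /* crorc  */
    default:      r = a ^ b;          break;  /* crxor  */
    }

    uint32_t mask = 1u << (31 - bt);
    return (cr & ~mask) | (r ? mask : 0u);
}
```

In this model, applying CR_EQV with BT, BA, and BB all equal sets the target bit to 1, and applying CR_XOR with all three equal clears it.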
This instruction is a hint that performance will probably be improved if the block
containing the byte addressed by EA is established in the data cache without
fetching the block from main storage, because the program will probably soon
store into a portion of the block and the contents of the rest of the block are not
meaningful to the program. If the hint is honored, the contents of the block are
undefined when the instruction completes. The hint is ignored if the block is
Caching Inhibited.
This instruction is treated as a Store (see Section 6.2.4.4 and Section 6.3.2),
except that an interrupt is not taken for a translation or protection violation.
This instruction may establish a block in the data cache without verifying that the
associated real address is valid. This can cause a delayed Machine Check inter-
rupt, as described in Section 7.4.4, “Machine Check Interrupts,” on page 151.
Engineering Note
If the target block is already in the data cache, leaving the contents of the block unmodi-
fied may provide the best performance, especially if the block is Write Through Required.
However, setting the contents of the block to zero may be easier to implement, because
of the similarity to dcbz[e].
If the target block is not already in the data cache and the block is Write Through
Required, ignoring the hint may provide the best performance.
If a dcba[e] causes the target block to be newly established in the data cache, proces-
sors must set all bytes of the block to zero, and all processors must treat the access as a
Store. In particular, if the newly established block is Write Through Required, the con-
tents of the cache block must be written to main storage.
Architecture Note
dcba[e] setting all bytes of newly established cache blocks to zero prevents a program
executing the dcba[e] from reading the preexisting contents of the block, which may
include data that the program is not authorized to read. Such prevention is a require-
ment of secure systems.
If the block containing the byte addressed by EA is in storage that is not Memory
Coherence Required, a block containing the byte addressed by EA is in the data
cache of this processor, and any locations in the block are considered to be modi-
fied there, then those locations are written to main storage. Additional locations in
the block may also be written to main storage. The block is invalidated in the data
cache of this processor.
This instruction is treated as a Load (see Section 6.2.4.4 and Section 6.3.2).
If the block containing the byte addressed by EA is in storage that is not Memory
Coherence Required and a block containing the byte addressed by EA is in the
data cache of this processor, then the block is invalidated in that data cache. On
some implementations, before the block is invalidated, if any locations in the
block are considered to be modified in that data cache, those locations are written
to main storage and additional locations in the block may be written to main stor-
age.
This instruction is treated as a Store (see Section 6.2.4.4 and Section 6.3.2) on
implementations that invalidate a block without first writing to main storage all
locations in the block that are considered to be modified in the data cache, except
that the invalidation is not ordered by mbar. On other implementations this
instruction is treated as a Load (see the section cited above).
• The data cache block size for dcbi[e] is the same as for dcbf[e].
dcbi[e] may cause a cache locking exception. See the User’s Manual for the imple-
mentation.
If the block containing the byte addressed by EA is in storage that is not Memory
Coherence Required and a block containing the byte addressed by EA is in the
data cache of this processor and any locations in the block are considered to be
modified there, those locations are written to main storage. Additional locations in
the block may be written to main storage. The block ceases to be considered to be
modified in that data cache.
This instruction is treated as a Load (see Section 6.2.4.4 and Section 6.3.2).
This instruction is treated as a Load (see Section 6.2.4.4 and Section 6.3.2),
except that an interrupt is not taken for a translation or protection violation.
Engineering Note
Programs are likely to execute dcbt[e] for several blocks before executing Load or Store
instructions that refer to the first of these blocks. Implementations on which dcbt[e]
fetches the block into a separate buffer rather than directly into the data cache should
provide buffer space sufficient for this use.
This instruction is treated as a Load (see Section 6.2.4.4 and Section 6.3.2),
except that an interrupt is not taken for a translation or protection violation.
Engineering Note
Executing dcbtst[e] does not cause the specified block to be considered to be modified in
the data cache.
If the block containing the byte addressed by EA is in the data cache, all bytes of
the block are set to zero.
If the block containing the byte addressed by EA is not in the data cache and is in
storage that is not Caching Inhibited, the block is established in the data cache
without fetching the block from main storage, and all bytes of the block are set to
zero.
This instruction may establish a block in the data cache without verifying that the
associated real address is valid. This can cause a delayed Machine Check inter-
rupt, as described in Section 7.4.4, “Machine Check Interrupts,” on page 151.
This instruction is treated as a Store (see Section 6.2.4.4 and Section 6.3.2).
This instruction may cause a cache locking exception. See the User’s Manual for
the implementation.
Programming Note
If the block containing the byte addressed by EA is in storage that is Caching Inhibited
or Write Through Required, the Alignment interrupt handler should set to zero all bytes
of the area of main storage that corresponds to the addressed block.
dividend0:63 ← GPR(RA)
divisor0:63 ← GPR(RB)
if OE=1 then do
OV64 ← ( (GPR(RA)=-2⁶³) & (GPR(RB)=-1) ) | (GPR(RB)=0)
SO64 ← SO64 | OV64
GPR(RT) ← dividend ÷ divisor
The 64-bit quotient of the contents of GPR(RA) divided by the contents of GPR(RB)
is placed into GPR(RT). The remainder is not supplied as a result.
Both operands and the quotient are interpreted as signed integers. The quotient is
the unique signed integer that satisfies
dividend = (quotient × divisor) + r
where 0 ≤ r < |divisor| if the dividend is nonnegative, and –|divisor| < r ≤ 0 if the
dividend is negative.
If an attempt is made to perform any of the divisions
0x8000_0000_0000_0000 ÷ -1
<anything> ÷ 0
then the contents of GPR(RT) are undefined. In these cases, if OE=1 then OV is set
to 1.
Programming Note
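The 64-bit signed remainder of dividing GPR(RA) by GPR(RB) can be computed as
follows, except in the case that GPR(RA) = –2⁶³ and GPR(RB) = –1.
divd RT,RA,RB # RT = quotient
mulld RT,RT,RB # RT = quotient × divisor
subf RT,RT,RA # RT = remainder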
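As an illustration, the signed quotient, the remainder, and the two undefined cases can be modeled in C. This is a sketch only: divd_result and divd_model are hypothetical names, and C99 or later is assumed so that integer division truncates toward zero, which matches the range stated for r in the description of the quotient.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool    defined;  /* false where the text says GPR(RT) is undefined */
    int64_t quot;
    int64_t rem;
} divd_result;

static divd_result divd_model(int64_t ra, int64_t rb)
{
    divd_result res = { false, 0, 0 };

    /* <anything> / 0 and -2^63 / -1 are the undefined cases. */
    if (rb == 0 || (ra == INT64_MIN && rb == -1))
        return res;

    res.defined = true;
    res.quot = ra / rb;                 /* truncates toward zero */
    res.rem  = ra - res.quot * rb;      /* remainder = dividend - quotient * divisor */
    return res;
}
```

The remainder takes the sign of the dividend, so dividing 7 by -2 in this model gives a quotient of -3 and a remainder of 1.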
dividend0:63 ← GPR(RA)
divisor0:63 ← GPR(RB)
quotient0:63 ← dividend ÷ divisor
if OE=1 then do
OV64 ← (GPR(RB)=0)
SO64 ← SO64 | OV64
GPR(RT) ← quotient
The 64-bit quotient of the contents of GPR(RA) divided by the contents of GPR(RB)
is placed into GPR(RT). The remainder is not supplied as a result.
Both operands and the quotient are interpreted as unsigned integers. The quotient
is the unique unsigned integer that satisfies
dividend = (quotient × divisor) + r
where 0 ≤ r < divisor.
If an attempt is made to perform the division
<anything> ÷ 0
then the contents of GPR(RT) are undefined. In this case, if OE=1 then OV is set to
1.
Programming Note
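The 64-bit unsigned remainder of dividing GPR(RA) by GPR(RB) can be computed as
follows.
divdu RT,RA,RB # RT = quotient
mulld RT,RT,RB # RT = quotient × divisor
subf RT,RT,RA # RT = remainder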
dividend0:31 ← GPR(RA)32:63
divisor0:31 ← GPR(RB)32:63
quotient0:31 ← dividend ÷ divisor
if OE=1 then do
OV ← ( (GPR(RA)32:63=-2³¹) & (GPR(RB)32:63=-1) ) | (GPR(RB)32:63=0)
SO ← SO | OV
if Rc=1 then do
LT ← quotient < 0
GT ← quotient > 0
EQ ← quotient = 0
CR0 ← LT || GT || EQ || SO
GPR(RT)32:63 ← quotient
GPR(RT)0:31 ← undefined
The 32-bit quotient of the contents of bits 32:63 of GPR(RA) divided by the con-
tents of bits 32:63 of GPR(RB) is placed into bits 32:63 of GPR(RT). Bits 0:31 of
GPR(RT) are undefined. The remainder is not supplied as a result.
Both operands and the quotient are interpreted as signed integers. The quotient is
the unique signed integer that satisfies
dividend = (quotient × divisor) + r
where 0 ≤ r < |divisor| if the dividend is nonnegative, and –|divisor| < r ≤ 0 if the
dividend is negative.
If an attempt is made to perform any of the divisions
0x8000_0000 ÷ -1
<anything> ÷ 0
then the contents of GPR(RT) are undefined as are (if Rc=1) the contents of the LT,
GT, and EQ bits of CR Field 0. In these cases, if OE=1 then OV is set to 1.
Programming Note
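The 32-bit signed remainder of dividing GPR(RA)32:63 by GPR(RB)32:63 can be computed
as follows, except in the case that GPR(RA)32:63 = –2³¹ and GPR(RB)32:63 = –1.
divw RT,RA,RB # RT = quotient
mullw RT,RT,RB # RT = quotient × divisor
subf RT,RT,RA # RT = remainder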
dividend0:31 ← GPR(RA)32:63
divisor0:31 ← GPR(RB)32:63
quotient0:31 ← dividend ÷ divisor
if OE=1 then do
OV ← (GPR(RB)32:63=0)
SO ← SO | OV
if Rc=1 then do
LT ← quotient < 0
GT ← quotient > 0
EQ ← quotient = 0
CR0 ← LT || GT || EQ || SO
GPR(RT)32:63 ← quotient
GPR(RT)0:31 ← undefined
The 32-bit quotient of the contents of bits 32:63 of GPR(RA) divided by the con-
tents of bits 32:63 of GPR(RB) is placed into bits 32:63 of GPR(RT). Bits 0:31 of
GPR(RT) are undefined. The remainder is not supplied as a result.
Both operands and the quotient are interpreted as unsigned integers, except that
if Rc=1 the first three bits of CR Field 0 are set by signed comparison of the result
to zero. The quotient is the unique unsigned integer that satisfies
dividend = (quotient × divisor) + r
where 0 ≤ r < divisor.
If an attempt is made to perform the division
<anything> ÷ 0
then the contents of GPR(RT) are undefined as are (if Rc=1) the contents of the LT,
GT, and EQ bits of CR Field 0. In this case, if OE=1 then OV is set to 1.
Programming Note
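The 32-bit unsigned remainder of dividing GPR(RA)32:63 by GPR(RB)32:63 can be
computed as follows.
divwu RT,RA,RB # RT = quotient
mullw RT,RT,RB # RT = quotient × divisor
subf RT,RT,RA # RT = remainder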
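The 32-bit unsigned case can be sketched in C in the same way. The names divwu_result and divwu_model are assumptions made for this example; only bits 32:63 of each operand take part, and the high-order half of the result, which the architecture leaves undefined, is simply not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     defined;  /* false for <anything> / 0 */
    uint32_t quot;     /* models GPR(RT) bits 32:63 */
    uint32_t rem;
} divwu_result;

static divwu_result divwu_model(uint64_t ra, uint64_t rb)
{
    uint32_t dividend = (uint32_t)ra;   /* GPR(RA) bits 32:63 */
    uint32_t divisor  = (uint32_t)rb;   /* GPR(RB) bits 32:63 */
    divwu_result res = { false, 0, 0 };

    if (divisor == 0)
        return res;

    res.defined = true;
    res.quot = dividend / divisor;
    res.rem  = dividend - res.quot * divisor;
    return res;
}
```

Because the high-order words are discarded before dividing, an operand such as 0xFFFF_FFFF_0000_0007 is treated as 7.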
The contents of GPR(RS) are XORed with the contents of GPR(RB) and the one’s
complement of the result is placed into GPR(RA).
if ‘extsb[.]’ then n ← 56
if ‘extsh[.]’ then n ← 48
if ‘extsw’ then n ← 32
if Rc=1 then do
LT ← GPR(RS)n:63 < 0
GT ← GPR(RS)n:63 > 0
EQ ← GPR(RS)n:63 = 0
CR0 ← LT || GT || EQ || SO
s ← GPR(RS)n
GPR(RA) ← ⁿs || GPR(RS)n:63
For extsb[.], the contents of bits 56:63 of GPR(RS) are placed into bits 56:63 of
GPR(RA). Bit 56 of the contents of GPR(RS) is copied into bits 0:55 of GPR(RA). If
Rc=1, CR Field 0 is set to reflect the result.
For extsh[.], the contents of bits 48:63 of GPR(RS) are placed into bits 48:63 of
GPR(RA). Bit 48 of the contents of GPR(RS) is copied into bits 0:47 of GPR(RA). If
Rc=1, CR Field 0 is set to reflect the result.
For extsw, the contents of bits 32:63 of GPR(RS) are placed into bits 32:63 of
GPR(RA). Bit 32 of the contents of GPR(RS) is copied into bits 0:31 of GPR(RA). If
Rc=1, the instruction form is invalid.
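The three sign-extension cases share one pattern, which the following C sketch makes explicit. The function name exts_model is hypothetical; its parameter n is the bit number of the sign bit as used in the pseudocode (56 for extsb, 48 for extsh, 32 for extsw), with bit 0 being the most significant bit.

```c
#include <stdint.h>

/* Copy bits n:63 of rs and replicate bit n into bits 0:n-1. */
static uint64_t exts_model(uint64_t rs, unsigned n)
{
    unsigned width    = 64 - n;                        /* 8, 16, or 32 */
    uint64_t low_mask = (UINT64_C(1) << width) - 1;    /* selects bits n:63 */
    uint64_t low      = rs & low_mask;
    uint64_t sign     = (rs >> (width - 1)) & 1u;      /* bit n of rs */

    return sign ? (low | ~low_mask) : low;
}
```

For example, exts_model(0x80, 56) yields 0xFFFF_FFFF_FFFF_FF80, while exts_model(0x7F, 56) yields 0x7F.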
FPR(FRT) ← 0b0||FPR(FRB)1:63
The contents of FPR(FRB) with bit 0 set to zero are placed into FPR(FRT).
If the most significant bit of the resultant significand is not 1, the result is nor-
malized. The result is rounded to the target precision under control of the Float-
ing-Point Rounding Control field RN of the Floating-Point Status and Control
Register and placed into FPR(FRT).
If a carry occurs, the sum's significand is shifted right one bit position and the
exponent is increased by one.
FPSCRFPRF is set to the class and sign of the result, except for Invalid Operation
Exceptions when FPSCRVE=1.
fcfid FRT,FRB
1 1 1 1 1 1 FRT /// FRB 1 1 0 1 0 0 1 1 1 0 /
0 6 11 16 21 31
sign ← FPR(FRB)0
exp ← 63
frac0:63 ← FPR(FRB)
If frac0:63 = 0 then go to Zero Operand
If sign = 1 then frac0:63 ← ¬frac0:63 + 1
Do while frac0 = 0 /* do loop 0 times if FPR(FRB) = max negative integer */
frac0:63 ← frac1:63 || 0b0
exp ← exp – 1
End
Zero Operand:
FPSCRFR FI ← 0b00
FPSCRFPRF ← ‘+zero’
FPR(FRT) ← 0x0000_0000_0000_0000
Done
FPSCRFPRF is set to the class and sign of the result. FPSCRFR is set if the result is
incremented when rounded. FPSCRFI is set if the result is inexact.
if FPR(FRA) is a NaN or
FPR(FRB) is a NaN then c ← 0b0001
else if FPR(FRA) < FPR(FRB) then c ← 0b1000
else if FPR(FRA) > FPR(FRB) then c ← 0b0100
else c ← 0b0010
FPCC ← c
CR4×BF:4×BF+3 ← c
if ‘fcmpu’ then do
if FPR(FRA) is a SNaN or FPR(FRB) is a SNaN then
VXSNAN ← 1
if ‘fcmpo’ then do
if FPR(FRA) is a SNaN or FPR(FRB) is a SNaN then do
VXSNAN ← 1
if VE=0 then VXVC ← 1
else if FPR(FRA) is a QNaN or FPR(FRB) is a QNaN then VXVC ← 1
If either of the operands is a NaN, either quiet or signaling, then CR field BF and
the FPCC are set to reflect unordered.
If fcmpu then if either of the operands is a Signaling NaN, then VXSNAN is set.
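The selection of the four-bit condition code c in the pseudocode above can be sketched in C as follows. The function name fcmp_cond is an assumption made for this example, and the exception bits (VXSNAN, VXVC) are deliberately not modeled, since C offers no portable way to distinguish signaling from quiet NaNs.

```c
#include <math.h>

/* Return the 4-bit value placed in FPCC and in the selected CR field:
   0b1000 less than, 0b0100 greater than, 0b0010 equal, 0b0001 unordered. */
static unsigned fcmp_cond(double fra, double frb)
{
    if (isnan(fra) || isnan(frb))
        return 0x1;
    if (fra < frb)
        return 0x8;
    if (fra > frb)
        return 0x4;
    return 0x2;
}
```

Positive and negative zero compare as equal in this model, so fcmp_cond(0.0, -0.0) returns 0b0010.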
Infinity Operand:
FPSCRFR FI VXCVI ← 0b001
If FPSCRVE = 0 then Do
If sign = 0 then FPR(FRT) ← 0x7FFF_FFFF_FFFF_FFFF
SNaN Operand:
FPSCRFR FI VXSNAN VXCVI ← 0b0011
If FPSCRVE = 0 then Do
FPR(FRT) ← 0x8000_0000_0000_0000
FPSCRFPRF ← undefined
End
Done
QNaN Operand:
FPSCRFR FI VXCVI ← 0b001
If FPSCRVE = 0 then Do
FPR(FRT) ← 0x8000_0000_0000_0000
FPSCRFPRF ← undefined
End
Done
Large Operand:
FPSCRFR FI VXCVI ← 0b001
If FPSCRVE = 0 then Do
If sign = 0 then FPR(FRT) ← 0x7FFF_FFFF_FFFF_FFFF
If sign = 1 then FPR(FRT) ← 0x8000_0000_0000_0000
FPSCRFPRF ← undefined
End
Done
SNaN Operand:
FPSCRFR FI VXSNAN VXCVI ← 0b0011
If FPSCRVE = 0 then Do /* u is undefined hex digit */
FPR(FRT) ← 0xuuuu_uuuu_8000_0000
FPSCRFPRF ← undefined
End
Done
QNaN Operand:
FPSCRFR FI VXCVI ← 0b001
If FPSCRVE = 0 then Do /* u is undefined hex digit */
FPR(FRT) ← 0xuuuu_uuuu_8000_0000
FPSCRFPRF ← undefined
End
Done
Large Operand:
FPSCRFR FI VXCVI ← 0b001
If FPSCRVE = 0 then Do /* u is undefined hex digit */
If sign = 0 then FPR(FRT) ← 0xuuuu_uuuu_7FFF_FFFF
If sign = 1 then FPR(FRT) ← 0xuuuu_uuuu_8000_0000
FPSCRFPRF ← undefined
End
Done
If the operand in FPR(FRB) is greater than 2³¹–1, then bits 32:63 of FPR(FRT) are
set to 0x7FFF_FFFF. If the operand in FPR(FRB) is less than –2³¹, then bits 32:63
of FPR(FRT) are set to 0x8000_0000.
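The saturating behavior of the low-order result word can be illustrated with a short C sketch. It rests on several assumptions that are not stated in the text: the function name convert_to_word_model is hypothetical, the Invalid Operation Exception is taken to be disabled (VE=0), NaN operands produce 0x8000_0000 as in the NaN pseudocode above, and in-range operands are truncated toward zero rather than rounded according to RN.

```c
#include <stdint.h>

/* Model of bits 32:63 of FPR(FRT); bits 0:31 are undefined and not modeled. */
static uint32_t convert_to_word_model(double frb)
{
    if (frb != frb)                       /* NaN operand */
        return 0x80000000u;
    if (frb > 2147483647.0)               /* greater than 2^31 - 1 */
        return 0x7FFFFFFFu;
    if (frb < -2147483648.0)              /* less than -2^31 */
        return 0x80000000u;
    return (uint32_t)(int32_t)frb;        /* assumed: truncate toward zero */
}
```

With these assumptions, an operand of 1e10 saturates to 0x7FFF_FFFF and an operand of -1e10 saturates to 0x8000_0000.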
If the most significant bit of the resultant significand is not 1, the result is nor-
malized. The result is rounded to the target precision under control of the Float-
ing-Point Rounding Control field RN of the Floating-Point Status and Control
Register and placed into FPR(FRT).
FPSCRFPRF is set to the class and sign of the result, except for Invalid Operation
Exceptions when FPSCRVE=1 and Zero Divide Exceptions when FPSCRZE=1.
If the most significant bit of the resultant significand is not 1, the result is nor-
malized. The result is rounded to the target precision under control of the Float-
ing-Point Rounding Control field RN of the Floating-Point Status and Control
Register and placed into FPR(FRT).
FPSCRFPRF is set to the class and sign of the result, except for Invalid Operation
Exceptions when FPSCRVE=1.
FPR(FRT) ← FPR(FRB)
If the most significant bit of the resultant significand is not 1, the result is nor-
malized. The result is rounded to the target precision under control of the Float-
ing-Point Rounding Control field RN of the Floating-Point Status and Control
Register and placed into FPR(FRT).
FPSCRFPRF is set to the class and sign of the result, except for Invalid Operation
Exceptions when FPSCRVE=1.
If the most significant bit of the resultant significand is not 1, the result is nor-
malized. The result is rounded to the target precision under control of the Float-
ing-Point Rounding Control field RN of the Floating-Point Status and Control
Register and placed into FPR(FRT).
FPSCRFPRF is set to the class and sign of the result, except for Invalid Operation
Exceptions when FPSCRVE=1.
FPR(FRT) ← 0b1||FPR(FRB)1:63
The contents of FPR(FRB) with bit 0 set to one are placed into FPR(FRT).
Floating Negate
FPR(FRT) ← ¬FPR(FRB)0||FPR(FRB)1:63
The contents of FPR(FRB) with bit 0 inverted are placed into FPR(FRT).
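The three sign-manipulation moves (bit 0 set to zero, bit 0 set to one, bit 0 inverted) act only on the sign bit of the 64-bit image. A minimal C sketch follows, assuming that double uses the 64-bit IEEE format on the host; the helper names bits_of, from_bits, fabs_model, fnabs_model, and fneg_model are illustrative.

```c
#include <stdint.h>
#include <string.h>

#define SIGN_BIT UINT64_C(0x8000000000000000)   /* bit 0 of the FPR image */

static uint64_t bits_of(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

static double from_bits(uint64_t u)
{
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}

/* FPR(FRT) <- 0b0 || FPR(FRB)1:63 */
static double fabs_model(double frb)  { return from_bits(bits_of(frb) & ~SIGN_BIT); }

/* FPR(FRT) <- 0b1 || FPR(FRB)1:63 */
static double fnabs_model(double frb) { return from_bits(bits_of(frb) | SIGN_BIT); }

/* FPR(FRT) <- not FPR(FRB)0 || FPR(FRB)1:63 */
static double fneg_model(double frb)  { return from_bits(bits_of(frb) ^ SIGN_BIT); }
```

Because only the sign bit changes, these operations apply equally to zeros, infinities, and NaNs; negating +0.0 in this model produces an image equal to SIGN_BIT.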
If the most significant bit of the resultant significand is not 1, the result is nor-
malized. The result is rounded to the target precision under control of the Float-
ing-Point Rounding Control field RN of the Floating-Point Status and Control
Register, then negated and placed into FPR(FRT).
This instruction produces the same result as would be obtained by using the
Floating Multiply-Add instruction and then negating the result, with the following
exceptions.
• QNaNs that are generated as the result of a disabled Invalid Operation Excep-
tion have a ‘sign’ bit of 0.
• SNaNs that are converted to QNaNs as the result of a disabled Invalid Opera-
tion Exception retain the ‘sign’ bit of the SNaN.
FPSCRFPRF is set to the class and sign of the result, except for Invalid Operation
Exceptions when FPSCRVE=1.
If the most significant bit of the resultant significand is not 1, the result is nor-
malized. The result is rounded to the target precision under control of the Float-
ing-Point Rounding Control field RN of the Floating-Point Status and Control
Register, then negated and placed into FPR(FRT).
This instruction produces the same result as would be obtained by using the
Floating Multiply-Subtract instruction and then negating the result, with the fol-
lowing exceptions.
• QNaNs that are generated as the result of a disabled Invalid Operation Excep-
tion have a ‘sign’ bit of 0.
• SNaNs that are converted to QNaNs as the result of a disabled Invalid Opera-
tion Exception retain the ‘sign’ bit of the SNaN.
FPSCRFPRF is set to the class and sign of the result, except for Invalid Operation
Exceptions when FPSCRVE=1.
$$\left|\frac{\text{estimate} - \frac{1}{x}}{\frac{1}{x}}\right| \le \frac{1}{256}$$
where x is the initial value in FPR(FRB). Note that the value placed into FPR(FRT)
may vary between implementations, and between different executions on the
same implementation.
FPSCRFPRF is set to the class and sign of the result, except for Invalid Operation
Exceptions when FPSCRVE=1 and Zero Divide Exceptions when FPSCRZE=1.
Architecture Note
No double-precision version of this instruction is provided because graphics applica-
tions are expected to need only the single-precision version, and no other important
performance-critical applications are expected to need a double-precision version.
Denormalize operand:
G || R || X ← 0b000
Do while exp < –126
exp ← exp + 1
frac0:52 || G || R || X ← 0b0 || frac0:52 || G || (R | X)
Zero Operand:
FPR(FRT) ← FPR(FRB)
If FPR(FRB)0 = 0 then FPSCRFPRF ← ‘+zero’
If FPR(FRB)0 = 1 then FPSCRFPRF ← ‘–zero’
FPSCRFR FI ← 0b00
Done
Infinity Operand:
FPR(FRT) ← FPR(FRB)
If FPR(FRB)0 = 0 then FPSCRFPRF ← ‘+infinity’
If FPR(FRB)0 = 1 then FPSCRFPRF ← ‘–infinity’
FPSCRFR FI ← 0b00
Done
SNaN Operand:
FPSCRVXSNAN ← 1
If FPSCRVE = 0 then Do
FPR(FRT)0:11 ← FPR(FRB)0:11
FPR(FRT)12 ← 1
FPR(FRT)13:63 ← FPR(FRB)13:34 || ²⁹0
FPSCRFPRF ← ‘QNaN’
FPSCRFR FI ← 0b00
Done
Normal Operand:
sign ← FPR(FRB)0
exp ← FPR(FRB)1:11 – 1023
frac0:52 ← 0b1 || FPR(FRB)12:63
Round Single(sign,exp,frac0:52,0,0,0)
FPSCRXX ← FPSCRXX | FPSCRFI
If exp > 127 and FPSCROE = 0 then go to Disabled Exponent Overflow
If exp > 127 and FPSCROE = 1 then go to Enabled Overflow
FPR(FRT)0 ← sign
FPR(FRT)1:11 ← exp + 1023
FPR(FRT)12:63 ← frac1:52
If sign = 0 then FPSCRFPRF ← ‘+normal number’
If sign = 1 then FPSCRFPRF ← ‘–normal number’
Done
Round Single(sign,exp,frac0:52,G,R,X):
inc ← 0
lsb ← frac23
gbit ← frac24
rbit ← frac25
xbit ← (frac26:52||G||R||X)≠0
If FPSCRRN = 0b00 then Do /* comparison ignores u bits */
If sign || lsb || gbit || rbit || xbit = 0bu11uu then inc ← 1
If sign || lsb || gbit || rbit || xbit = 0bu011u then inc ← 1
If sign || lsb || gbit || rbit || xbit = 0bu01u1 then inc ← 1
If FPSCRRN = 0b10 then Do /* comparison ignores u bits */
If sign || lsb || gbit || rbit || xbit = 0b0u1uu then inc ← 1
If sign || lsb || gbit || rbit || xbit = 0b0uu1u then inc ← 1
If sign || lsb || gbit || rbit || xbit = 0b0uuu1 then inc ← 1
If FPSCRRN = 0b11 then Do /* comparison ignores u bits */
If sign || lsb || gbit || rbit || xbit = 0b1u1uu then inc ← 1
If sign || lsb || gbit || rbit || xbit = 0b1uu1u then inc ← 1
If sign || lsb || gbit || rbit || xbit = 0b1uuu1 then inc ← 1
frac0:23 ← frac0:23 + inc
If carry_out = 1 then Do
frac0:23 ← 0b1 || frac0:22
exp ← exp + 1
frac24:52 ← ²⁹0
FPSCRFR ← inc
FPSCRFI ← gbit | rbit | xbit
Return
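The bit patterns tested in Round Single reduce to a small Boolean function per rounding mode. The following C sketch, with the hypothetical function name round_inc, computes inc from the same five inputs. The mode labels in the comments follow the usual meaning of the RN encodings; RN=0b01 has no pattern in the pseudocode, so inc stays 0 in that case.

```c
/* Compute the rounding increment for a given FPSCR[RN] value.
   All bit arguments are expected to be 0 or 1. */
static unsigned round_inc(unsigned rn, unsigned sign, unsigned lsb,
                          unsigned gbit, unsigned rbit, unsigned xbit)
{
    unsigned inexact = gbit | rbit | xbit;

    switch (rn) {
    case 0:  /* 0b00, round to nearest: u11uu, u011u, u01u1 */
        return gbit & (lsb | rbit | xbit);
    case 2:  /* 0b10, toward +infinity: sign = 0 and any of G, R, X set */
        return (sign == 0 && inexact) ? 1u : 0u;
    case 3:  /* 0b11, toward -infinity: sign = 1 and any of G, R, X set */
        return (sign == 1 && inexact) ? 1u : 0u;
    default: /* 0b01, toward zero: no pattern listed, inc remains 0 */
        return 0u;
    }
}
```

For RN=0b00 an exact tie (gbit=1, rbit=0, xbit=0) increments only when lsb is 1, which leaves the rounded significand even.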
FPSCRFPRF is set to the class and sign of the result, except for Invalid Operation
Exceptions when FPSCRVE=1.
$$\left|\frac{\text{estimate} - \frac{1}{\sqrt{x}}}{\frac{1}{\sqrt{x}}}\right| \le \frac{1}{32}$$
where x is the initial value in FPR(FRB). Note that the value placed into FPR(FRT)
may vary between implementations, and between different executions on the
same implementation.
FPSCRFPRF is set to the class and sign of the result, except for Invalid Operation
Exceptions when FPSCRVE=1 and Zero Divide Exceptions when FPSCRZE=1.
Architecture Note
No single-precision version of this instruction is provided because it would be
superfluous: if (FRB) is representable in single format, then so is (FRT).
The floating-point operand in FPR(FRA) is compared to the value zero. If the oper-
and is greater than or equal to zero, FPR(FRT) is set to the contents of FPR(FRC).
If the operand is less than zero or is a NaN, FPR(FRT) is set to the contents of
FPR(FRB). The comparison ignores the sign of zero (i.e., regards +0 as equal to –0).
Architecture Note
The Select instruction is similar to a Move instruction, and therefore does not alter the
Floating-Point Status and Control Register.
Programming Note
Examples of uses of this instruction can be found in Appendix C.4.
Warning: Care must be taken in using fsel if IEEE compatibility is required, or if the
values being tested can be NaNs or infinities; see Section C.4.4 on page 396.
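The selection rule can be modeled in a single C expression. In the sketch below, fsel_model and max_via_fsel are hypothetical names; the second function is an illustrative use in the spirit of the examples the note refers to, not a reproduction of them, and it inherits the caveats in the warning above when its operands can be NaNs or infinities.

```c
/* FPR(FRT) <- FRC if FRA >= 0, else FRB. A NaN in FRA fails the comparison
   and therefore selects FRB; -0.0 compares equal to +0.0 and selects FRC. */
static double fsel_model(double fra, double frc, double frb)
{
    return (fra >= 0.0) ? frc : frb;
}

/* Branch-free maximum built on the select: choose a when a - b >= 0. */
static double max_via_fsel(double a, double b)
{
    return fsel_model(a - b, a, b);
}
```

The argument order of fsel_model mirrors the description: the tested operand first, then the value selected for greater than or equal to zero, then the value selected otherwise.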
If the most significant bit of the resultant significand is not 1, the result is nor-
malized. The result is rounded to the target precision under control of the Float-
ing-Point Rounding Control field RN of the Floating-Point Status and Control
Register and placed into FPR(FRT).
FPSCRFPRF is set to the class and sign of the result, except for Invalid Operation
Exceptions when FPSCRVE=1.
If the most significant bit of the resultant significand is not 1, the result is nor-
malized. The result is rounded to the target precision under control of the Float-
ing-Point Rounding Control field RN of the Floating-Point Status and Control
Register and placed into FPR(FRT).
FPSCRFPRF is set to the class and sign of the result, except for Invalid Operation
Exceptions when FPSCRVE=1.
If the block containing the byte addressed by EA is in storage that is not Memory
Coherence Required and a block containing the byte addressed by EA is in the
instruction cache of this processor, the block is invalidated in that instruction
cache, so that subsequent references cause the block to be fetched from main
storage.
This instruction may cause a cache locking exception. See the User’s Manual for
the implementation.
This instruction is treated as a Load (see Section 6.2.4.4), except that an interrupt
is not taken for a translation or protection violation.
isync
0 1 0 0 1 1 /// 0 0 1 0 0 1 0 1 1 0 /
0 6 21 31
The isync instruction provides an ordering function for the effects of all instruc-
tions executed by the processor executing the isync instruction. Executing an
isync instruction ensures that all instructions preceding the isync instruction
have completed before the isync instruction completes, and that no subsequent
instructions are initiated until after the isync instruction completes. It also
causes any prefetched instructions to be discarded, with the effect that subse-
quent instructions will be fetched and executed in the context established by the
instructions preceding the isync instruction.
The isync instruction may complete before storage accesses associated with
instructions preceding the isync instruction have been performed.
• For lbz and lbzu, let EA be 32 0s concatenated with bits 32:63 of the sum of
the contents of GPR(RA), or 64 0s if RA=0, and the sign-extended value of the
D instruction field.
• For lbzx and lbzux, let EA be 32 0s concatenated with bits 32:63 of the sum
of the contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
• For lbze and lbzue, let EA be the sum of the contents of GPR(RA), or 64 0s if
RA=0, and the sign-extended value of the DE instruction field.
• For lbzxe and lbzuxe, let EA be the sum of the contents of GPR(RA), or 64 0s
if RA=0, and the contents of GPR(RB).
The byte in storage addressed by EA is loaded into bits 56:63 of GPR(RT). Bits
0:55 of GPR(RT) are set to 0.
If U=1 (‘with update’), and RA=0 or RA=RT, the instruction form is invalid.
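The two effective-address rules in the list above differ only in whether the sum is truncated to its low-order 32 bits. The following C sketch shows both. The function names and parameters are assumptions made for illustration: ra_index is the RA field, ra_value is the content of GPR(RA), and disp is the already sign-extended displacement or the content of GPR(RB).

```c
#include <stdint.h>

/* Forms such as lbz and lbzx: 32 0s || bits 32:63 of the sum. */
static uint64_t ea_truncated(unsigned ra_index, uint64_t ra_value, int64_t disp)
{
    uint64_t base = (ra_index == 0) ? 0 : ra_value;   /* 64 0s if RA=0 */
    uint64_t sum  = base + (uint64_t)disp;            /* wraps modulo 2^64 */
    return sum & UINT64_C(0xFFFFFFFF);
}

/* Forms such as lbze and lbzxe: the full 64-bit sum. */
static uint64_t ea_extended(unsigned ra_index, uint64_t ra_value, int64_t disp)
{
    uint64_t base = (ra_index == 0) ? 0 : ra_value;
    return base + (uint64_t)disp;
}
```

With GPR(RA) = 0xFFFF_FFFF and a displacement of 1, the truncated form wraps to 0 while the extended form yields 0x1_0000_0000.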
ldarxe RT,RA,RB
0 1 1 1 1 1 RT RA RB 0 1 1 1 0 1 1 1 1 1 /
0 6 11 16 21 31
Let the effective address (EA) be the sum of the contents of GPR(RA), or 64 0s if
RA=0, and the contents of GPR(RB).
• For lde and ldue, let EA be the sum of the contents of GPR(RA), or 64 0s if
RA=0, and the sign-extended value of the DES instruction field concatenated
with 0b00.
• For ldxe and lduxe, let EA be the sum of the contents of GPR(RA), or 64 0s if
RA=0, and the contents of GPR(RB).
If U=1 (‘with update’), and RA=0 or RA=RT, the instruction form is invalid.
• For lfd and lfdu, let EA be 32 0s concatenated with bits 32:63 of the sum of
the contents of GPR(RA), or 64 0s if RA=0, and the sign-extended value of the
D instruction field.
• For lfdx and lfdux, let EA be 32 0s concatenated with bits 32:63 of the sum
of the contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
• For lfde and lfdue, let EA be the sum of the contents of GPR(RA), or 64 0s if
RA=0, and the sign-extended value of the DES instruction field concatenated
with 0b00.
• For lfdxe and lfduxe, let EA be the sum of the contents of GPR(RA), or 64 0s
if RA=0, and the contents of GPR(RB).
• For lfs and lfsu, let EA be 32 0s concatenated with bits 32:63 of the sum of
the contents of GPR(RA), or 64 0s if RA=0, and the sign-extended value of the
D instruction field.
• For lfsx and lfsux, let EA be 32 0s concatenated with bits 32:63 of the sum of
the contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
• For lfse and lfsue, let EA be the sum of the contents of GPR(RA), or 64 0s if
RA=0, and the sign-extended value of the DES instruction field concatenated
with 0b00.
• For lfsxe and lfsuxe, let EA be the sum of the contents of GPR(RA), or 64 0s if
RA=0, and the contents of GPR(RB).
• For lha and lhau, let EA be 32 0s concatenated with bits 32:63 of the sum of
the contents of GPR(RA), or 64 0s if RA=0, and the sign-extended value of the
D instruction field.
• For lhax and lhaux, let EA be 32 0s concatenated with bits 32:63 of the sum
of the contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
• For lhae and lhaue, let EA be the sum of the contents of GPR(RA), or 64 0s if
RA=0, and the sign-extended value of the DE instruction field.
• For lhaxe and lhauxe, let EA be the sum of the contents of GPR(RA), or 64 0s
if RA=0, and the contents of GPR(RB).
The halfword in storage addressed by EA is loaded into bits 48:63 of GPR(RT). Bits
32:47 of GPR(RT) are filled with a copy of bit 0 of the loaded halfword. Bits 0:31 of
GPR(RT) are set to 0.
If U=1 (‘with update’), and RA=0 or RA=RT, the instruction form is invalid.
• For lhbrx, let EA be 32 0s concatenated with bits 32:63 of the sum of the con-
tents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
Bits 0:7 of the halfword in storage addressed by EA are loaded into bits 56:63 of
GPR(RT). Bits 8:15 of the halfword in storage addressed by EA are loaded into
bits 48:55 of GPR(RT). Bits 0:47 of GPR(RT) are set to 0.
Programming Note
When EA references Big-Endian storage, these instructions have the effect of loading
data in Little-Endian byte order. Likewise, when EA references Little-Endian storage,
these instructions have the effect of loading data in Big-Endian byte order.
Programming Note
In some implementations, the Load Halfword Byte-Reverse Indexed instructions may
have greater latency than other Load instructions.
• For lhz and lhzu, let EA be 32 0s concatenated with bits 32:63 of the sum of
the contents of GPR(RA), or 64 0s if RA=0, and the sign-extended value of the
D instruction field.
• For lhzx and lhzux, let EA be 32 0s concatenated with bits 32:63 of the sum
of the contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
• For lhze and lhzue, let EA be the sum of the contents of GPR(RA), or 64 0s if
RA=0, and the sign-extended value of the DE instruction field.
• For lhzxe and lhzuxe, let EA be the sum of the contents of GPR(RA), or 64 0s
if RA=0, and the contents of GPR(RB).
The halfword in storage addressed by EA is loaded into bits 48:63 of GPR(RT). Bits
0:47 of GPR(RT) are set to 0.
If U=1 (‘with update’), and RA=0 or RA=RT, the instruction form is invalid.
lmw RT,D(RA)
1 0 1 1 1 0 RT RA D
0 6 11 16 31
Let the effective address (EA) be 32 0s concatenated with bits 32:63 of the sum of
the contents of GPR(RA), or 64 0s if RA=0, and the sign-extended value of the D
instruction field.
Engineering Note
Causing an Alignment interrupt if an attempt is made to execute a Load Multiple
instruction having an incorrectly aligned effective address facilitates the debugging of
software.
Architecture Note
Extended addressing modes are not defined for Load Multiple. Doubleword forms of Load
Multiple are not defined.
lswi RT,RA,NB
0 1 1 1 1 1 RT RA NB 1 0 0 1 0 1 0 1 0 1 /
0 6 11 16 21 31
lswx RT,RA,RB
0 1 1 1 1 1 RT RA RB 1 0 0 0 0 1 0 1 0 1 /
0 6 11 16 21 31
• For lswx, let EA be 32 0s concatenated with bits 32:63 of the sum of the con-
tents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
If lswi then n=NB if NB≠0, n=32 if NB=0. If lswx then n=XER57:63. n is the num-
ber of bytes to load. Let nr=CEIL(n÷4): nr is the number of registers to receive
data.
If n>0, n consecutive bytes in storage starting at address EA are loaded into regis-
ters GPR(RT) through GPR(RT+nr–1). Data are loaded into the low-order four
bytes of each GPR; the high-order four bytes are set to 0.
Bytes are loaded left to right in each register. The sequence of registers wraps
around to GPR(0) if required. If the low-order four bytes of GPR(RT+nr–1) are only
partially filled, the unfilled low-order byte(s) of that register are set to 0.
If RA, or RB for lswx, is in the range of registers to be loaded, including the case
in which RA=0, either an Illegal Instruction type Program interrupt is invoked or
the results are boundedly undefined. If RT=RA, or RT=RB for lswx, the instruc-
tion form is invalid.
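The register-filling rule for the Load String Word instructions can be sketched in C as follows. The names lswi_byte_count and lsw_model are hypothetical, storage is represented by a plain byte array, and the invalid and boundedly undefined forms described above are not checked.

```c
#include <stdint.h>

/* lswi: n = NB if NB != 0, otherwise 32. */
static unsigned lswi_byte_count(unsigned nb)
{
    return nb != 0 ? nb : 32;
}

/* Load n bytes from mem into gpr[rt], gpr[rt+1], ..., wrapping to gpr[0].
   Returns nr, the number of registers that receive data. */
static unsigned lsw_model(uint64_t gpr[32], unsigned rt,
                          const uint8_t *mem, unsigned n)
{
    unsigned nr = (n + 3) / 4;                      /* CEIL(n / 4) */

    for (unsigned i = 0; i < nr; i++) {
        uint32_t word = 0;
        for (unsigned j = 0; j < 4; j++) {
            unsigned k = 4 * i + j;
            uint8_t byte = (k < n) ? mem[k] : 0;    /* unfilled bytes are 0 */
            word = (word << 8) | byte;              /* bytes fill left to right */
        }
        gpr[(rt + i) % 32] = word;                  /* high-order four bytes are 0 */
    }
    return nr;
}
```

Loading five bytes starting at GPR(31) therefore fills GPR(31) completely and places the fifth byte in the leftmost byte of the low-order word of GPR(0).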
Architecture Note
Extended addressing modes are not defined for the Load String Word instructions. Dou-
bleword forms of the Load String Word instructions are not defined.
• For lwarx, let EA be 32 0s concatenated with bits 32:63 of the sum of the
contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
This instruction creates a reservation for use by a Store Word Conditional instruc-
tion. An address computed from the EA is associated with the reservation and
replaces any address previously associated with the reservation: the manner in
which the address to be associated with the reservation is computed from the EA
is described in Section 6.1.6.2 on page 117.
Programming Note
lwarx, lwarxe, and ldarxe, in combination with stwcx., stwcxe., and stdcxe., permit
the programmer to write a sequence of instructions that appear to perform an atomic
update operation on a storage location. This operation depends upon a single reserva-
tion resource in each processor. At most one reservation exists on any given processor:
there are not separate reservations for words and for doublewords.
Programming Note
Because lwarx, lwarxe, and ldarxe have implementation dependencies (e.g., the gran-
ularity at which reservations are managed), they must be used with care. The operating
system should provide system library programs that use these instructions to imple-
ment the high-level synchronization functions (Test and Set, Compare and Swap, etc.)
needed by application programs. Application programs should use these library pro-
grams, rather than use lwarx, lwarxe, and ldarxe directly.
Architecture Note
lwarx, lwarxe, and ldarxe require the EA to be aligned. Software should not attempt to
emulate an unaligned lwarx, lwarxe, or ldarxe, because there is no correct way to
define the address associated with the reservation.
Programming Note
The granularity with which reservations are managed is implementation-dependent.
Therefore the storage to be accessed by lwarx, lwarxe, or ldarxe should be allocated by
a system library program. Additional information can be found in Section 6.1.6.2 on
page 117.
• For lwbrx, let EA be 32 0s concatenated with bits 32:63 of the sum of the
contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
Bits 0:7 of the word in storage addressed by EA are loaded into bits 56:63 of
GPR(RT). Bits 8:15 of the word in storage addressed by EA are loaded into bits
48:55 of GPR(RT). Bits 16:23 of the word in storage addressed by EA are loaded
into bits 40:47 of GPR(RT). Bits 24:31 of the word in storage addressed by EA are
loaded into bits 32:39 of GPR(RT). Bits 0:31 of GPR(RT) are set to 0.
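The byte placement above amounts to reading the word with its bytes in the opposite order. A minimal Python sketch, assuming the argument holds the four bytes at EA in storage order (bits 0:7 first); the function name is hypothetical:

```python
def lwbrx(word):
    """Return the 64-bit GPR(RT) value for the four bytes found at EA."""
    if len(word) != 4:
        raise ValueError("a word is four bytes")
    # The first byte lands in bits 56:63, the last in bits 32:39.
    return int.from_bytes(word, "little")   # bits 0:31 of the result are 0


value = lwbrx(bytes([0x11, 0x22, 0x33, 0x44]))
```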
Programming Note
When EA references Big-Endian storage, these instructions have the effect of loading
data in Little-Endian byte order. Likewise, when EA references Little-Endian storage,
these instructions have the effect of loading data in Big-Endian byte order.
Programming Note
In some implementations, the Load Word Byte-Reverse instructions may have greater
latency than other Load instructions.
• For lwz and lwzu, let EA be 32 0s concatenated with bits 32:63 of the sum of
the contents of GPR(RA), or 64 0s if RA=0, and the sign-extended value of the
D instruction field.
• For lwzx and lwzux, let EA be 32 0s concatenated with bits 32:63 of the sum
of the contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
• For lwze and lwzue, let EA be the sum of the contents of GPR(RA), or 64 0s if
RA=0, and the sign-extended value of the DE instruction field.
• For lwzxe and lwzuxe, let EA be the sum of the contents of GPR(RA), or 64 0s
if RA=0, and the contents of GPR(RB).
The word in storage addressed by the EA is loaded into bits 32:63 of GPR(RT). Bits
0:31 of GPR(RT) are set to 0.
If U=1 (‘with update’), and RA=0 or RA=RT, the instruction form is invalid.
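The difference between the two kinds of EA calculation above can be sketched as follows. This is an illustration only: the helper names are hypothetical, the register file is a plain list, and the example assumes a 16-bit D field.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def effective_address(gpr, ra, operand, extended):
    """operand is the sign-extended displacement or the contents of GPR(RB)."""
    base = 0 if ra == 0 else gpr[ra]    # RA=0 selects 64 0s, not GPR(0)
    total = (base + operand) & MASK64
    # Non-extended forms keep bits 32:63 of the sum and prepend 32 0s.
    return total if extended else total & 0xFFFFFFFF


gpr = [0] * 32
gpr[0] = 0x1234
gpr[3] = 0x0000_0001_0000_0010
d = sign_extend(0xFFF0, 16)             # a displacement of -16
```

With these values the non-extended form yields an EA of 0, while the extended form yields 0x0000_0001_0000_0000.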
mbar MO
0 1 1 1 1 1 MO /// 1 1 0 1 0 1 0 1 1 0 /
0 6 11 21 31
When MO=0, the mbar instruction provides a storage ordering function for all
storage access instructions executed by the processor executing the mbar
instruction. Executing an mbar instruction ensures that all data storage accesses
caused by instructions preceding the mbar instruction have completed before any
data storage accesses caused by any instructions after the mbar instruction. This
order is seen by all mechanisms.
Programming Note
mbar is provided to implement a pipelined storage barrier. The following sequence illus-
trates one use of mbar in supporting shared data, ensuring the action is completed
prior to releasing the lock.
P1              P2
lock            ...
read & write    ...
mbar            ...
free lock       ...
...             lock
...             read & write
...             mbar
...             free lock
mcrf BF,BFA
0 1 0 0 1 1 BF // BFA /// 0 0 0 0 0 0 0 0 0 0 /
0 6 9 11 14 21 31
CR4×BF+32:4×BF+35 ← CR4×BFA+32:4×BFA+35
The contents of field BFA (bits 4×BFA+32:4×BFA+35) of the Condition Register are
copied to field BF (bits 4×BF+32:4×BF+35) of the Condition Register.
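A small Python sketch of the field copy, treating the Condition Register as a 32-bit integer whose most significant nibble is field 0 (bits 32:35). The function name and the integer model are assumptions for illustration.

```python
def mcrf(cr, bf, bfa):
    """Copy CR field bfa into CR field bf and return the new 32-bit CR."""
    source = (cr >> (28 - 4 * bfa)) & 0xF      # field i sits 28-4*i bits above the LSB
    shift = 28 - 4 * bf
    return (cr & ~(0xF << shift) & 0xFFFFFFFF) | (source << shift)


new_cr = mcrf(0x12345678, 0, 7)                # copy field 7 into field 0
```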
mcrfs BF,BFA
1 1 1 1 1 1 BF // BFA /// 0 0 0 1 0 0 0 0 0 0 /
0 6 9 11 14 21 31
CRBF×4:BF×4+3 ← FPSCRBFA×4:BFA×4+3
FPSCRBFA×4:BFA×4+3 ← 0b0000
The contents of Floating-Point Status and Control Register field BFA are copied to
Condition Register field BF. All exception bits copied are set to 0 in the Floating-
Point Status and Control Register. If the FX bit is copied, it is set to 0 in the Float-
ing-Point Status and Control Register.
mcrxr BF
0 1 1 1 1 1 BF /// 1 0 0 0 0 0 0 0 0 0 /
0 6 9 21 31
CR4×BF+32:4×BF+35 ← XER32:35
XER32:35 ← 0b0000
The contents of bits 32:35 of the Integer Exception Register are copied to Condi-
tion Register field BF. Bits 32:35 of the Integer Exception Register are set to zero.
mcrxr64 BF
0 1 1 1 1 1 BF /// 1 0 0 0 1 0 0 0 0 0 /
0 6 9 21 31
CR4×BF+32:4×BF+35 ← XER0:3
XER0:3 ← 0b0000
The contents of bits 0:3 of the Integer Exception Register are copied to Condition
Register field BF. Bits 0:3 of the Integer Exception Register are set to zero.
mfapidi RT,RA
0 1 1 1 1 1 RT RA /// 0 1 0 0 0 1 0 0 1 1 /
0 6 11 16 21 31
The contents of GPR(RA) are provided to any auxiliary processing extensions that
may be present. An implementation-dependent and extension-dependent value is
placed in GPR(RT).
Programming Note
This instruction is provided as a mechanism for software to query the presence and
configuration of one or more auxiliary processing extensions. See the User’s Manual for
the implementation for details on the behavior of this instruction.
mfcr RT
0 1 1 1 1 1 RT /// 0 0 0 0 0 1 0 0 1 1 /
0 6 11 21 31
GPR(RT) ← ³²0 || CR
The contents of the Condition Register are placed into bits 32:63 of GPR(RT). Bits
0:31 of GPR(RT) are set to 0.
mfdcr RT,DCRN
0 1 1 1 1 1 RT dcrn5:9 dcrn0:4 0 1 0 1 0 0 0 0 1 1 /
0 6 11 16 21 31
Let DCRN denote a Device Control Register (see the User’s Manual for a list of the
Device Control Registers supported by the implementation).
The contents of the designated Device Control Register are placed into GPR(RT).
For 32-bit Device Control Registers, the contents of the Device Control Register
are placed into bits 32:63 of GPR(RT). Bits 0:31 of GPR(RT) are set to 0.
FPR(FRT) ← FPSCR
The contents of the Floating-Point Status and Control Register are placed into bits
32:63 of FPR(FRT). Bits 0:31 of FPR(FRT) are undefined.
mfmsr RT
0 1 1 1 1 1 RT /// 0 0 0 1 0 1 0 0 1 1 /
0 6 11 21 31
The contents of the MSR are placed into bits 32:63 of GPR(RT). Bits 0:31 of
GPR(RT) are set to 0.
mfspr RT,SPRN
0 1 1 1 1 1 RT sprn5:9 sprn0:4 0 1 0 1 0 1 0 0 1 1 /
0 6 11 16 21 31
Let SPRN denote a Special Purpose Register (see Section B.1 for a list of Special
Purpose Registers defined by Book E, Section B.3 for a list of SPRN values
reserved by Book E, Section B.4 for a list of SPRN values allocated by Book E, and
Section B.2 for a list of Special Purpose Registers preserved by Book E).
The contents of the designated Special Purpose Register are placed into GPR(RT).
For 32-bit Special Purpose Registers, the contents of the Special Purpose Register
are placed into bits 32:63 of GPR(RT). Bits 0:31 of GPR(RT) are set to 0.
SPRN5  MSRPR  SPRN Class  Result
0      1      defined     if not implemented: Illegal Instruction exception
                          if implemented: as defined in Book E
0      1      allocated   if not implemented: Illegal Instruction exception
                          if implemented: as defined in User’s Manual
0      1      preserved   if not implemented: Illegal Instruction exception
                          if implemented: as defined in PowerPC Architecture
0      1      reserved    Illegal Instruction exception
1      1      —           Privileged exception
—      0      defined     if not implemented: boundedly undefined
                          if implemented: as defined in Book E
—      0      allocated   if not implemented: boundedly undefined
                          if implemented: as defined in User’s Manual
—      0      preserved   if not implemented: boundedly undefined
                          if implemented: as defined in PowerPC Architecture
—      0      reserved    boundedly undefined
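The table can be read as a short decision procedure. The Python sketch below returns the text of the Result column; the function name, the string values, and the boolean `implemented` flag are assumptions chosen for this illustration.

```python
def mfspr_outcome(sprn5, msr_pr, sprn_class, implemented):
    """sprn5 and msr_pr are single bits; sprn_class is "defined",
    "allocated", "preserved", or "reserved"."""
    if msr_pr == 1 and sprn5 == 1:
        return "Privileged exception"
    # MSRPR=1 with SPRN5=0 faults; with MSRPR=0 the result is boundedly undefined.
    fallback = "Illegal Instruction exception" if msr_pr == 1 else "boundedly undefined"
    if sprn_class == "reserved" or not implemented:
        return fallback
    source = {
        "defined": "Book E",
        "allocated": "User's Manual",
        "preserved": "PowerPC Architecture",
    }[sprn_class]
    return "as defined in " + source
```

Checking the privileged case first mirrors the table, where the SPRN class is not consulted when SPRN5=1 and MSRPR=1.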
msync
0 1 1 1 1 1 /// 1 0 0 1 0 1 0 1 1 0 /
0 6 21 31
The msync instruction provides an ordering function for the effects of all
instructions executed by the processor executing the msync instruction. Executing an
msync instruction ensures that all instructions preceding the msync instruction
have completed before the msync instruction completes, and that no subsequent
instructions are initiated until after the msync instruction completes. It also cre-
ates a memory barrier (see Section 6.1.6.1 on page 114), which orders the storage
accesses associated with these instructions.
The msync instruction does not complete until the storage accesses associated with
instructions preceding the msync instruction have been performed.
Programming Note
The msync instruction can be used to ensure that all stores into a data structure,
caused by Store instructions executed in a ‘critical section’ of a program, will be per-
formed with respect to another processor before the store that releases the lock is
performed with respect to that processor.
The functions performed by the msync instruction may take a significant amount of
time to complete, so indiscriminate use of this instruction may adversely affect perfor-
mance. The Memory Barrier (mbar) instruction on page 304 may be more appropriate
than msync for many cases.
Engineering Note
Unlike a context synchronizing operation, msync need not discard prefetched
instructions.
Programming Note
msync replaces the PowerPC sync instruction. msync uses the same opcode as sync,
so that PowerPC applications calling for a sync instruction will get the Book E msync
when executed on a Book E implementation. The functionality of msync is identical to
sync except that msync also does not complete until all previous storage accesses
complete. mbar is provided in Book E for those occasions when only ordering of storage
accesses is required without execution synchronization.
mtcrf FXM,RS
0 1 1 1 1 1 RS / FXM / 0 0 1 0 0 1 0 0 0 0 /
0 6 11 12 20 21 31
i ← 0
do while i < 8
if FXMi=1 then CR4×i+32:4×i+35 ← GPR(RS)4×i+32:4×i+35
i ← i+1
The contents of bits 32:63 of GPR(RS) are placed into the Condition Register
under control of the field mask specified by FXM. The field mask identifies the 4-
bit fields affected. Let i be an integer in the range 0-7. If FXMi = 1 then CR field i
(CR bits 4×i+32 through 4×i+35) is set to the contents of the corresponding field of
bits 32:63 of GPR(RS).
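The loop above can be sketched in Python as follows, with the Condition Register modeled as a 32-bit integer and FXM as an 8-bit integer whose most significant bit is FXM0. The function name and this integer model are assumptions for illustration.

```python
def mtcrf(cr, rs, fxm):
    """Return the new 32-bit CR after moving the selected fields from GPR(RS)."""
    for i in range(8):
        if (fxm >> (7 - i)) & 1:               # FXMi, with i=0 the leftmost mask bit
            field = 0xF << (28 - 4 * i)        # CR field i within the low-order word
            cr = (cr & ~field) | (rs & field)
    return cr & 0xFFFFFFFF


new_cr = mtcrf(0x11111111, 0xABCDEF98, 0b10000001)   # update fields 0 and 7 only
```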
mtdcr DCRN,RS
0 1 1 1 1 1 RS dcrn5:9 dcrn0:4 0 1 1 1 0 0 0 0 1 1 /
0 6 11 16 21 31
Let DCRN denote a Device Control Register (see the User’s Manual for a list of the
Device Control Registers supported by the implementation).
The contents of GPR(RS) are placed into the designated Device Control Register.
For 32-bit Device Control Registers, the contents of bits 32:63 of GPR(RS) are
placed into the Device Control Register.
FPSCRBT+32 ← 0b0
Programming Note
Bits 33 and 34 (FEX and VX) cannot be explicitly reset.
FPSCRBT+32 ← 0b1
Programming Note
Bits 33 and 34 (FEX and VX) cannot be explicitly set.
i ← 0
do while i<8
if FLMi=1 then FPSCR4×i+32:4×i+35 ← FPR(FRB)4×i+32:4×i+35
i ← i+1
The contents of bits 32:63 of FPR(FRB) are placed into the Floating-Point Status
and Control Register under control of the field mask specified by FLM. The field
mask identifies the 4-bit fields affected. Let i be an integer in the range 0-7. If
FLMi=1 then Floating-Point Status and Control Register field i (FPSCR bits 4×i+32
through 4×i+35) is set to the contents of the corresponding field of the low-order
32 bits of FPR(FRB).
Programming Note
Updating fewer than all eight fields of the Floating-Point Status and Control Register
may have substantially poorer performance on some implementations than updating all
the fields.
Programming Note
When FPSCR32:35 is specified, bits 32 (FX) and 35 (OX) are set to the values of (FRB)32
and (FRB)35 (i.e., even if this instruction causes OX to change from 0 to 1, FX is set from
(FRB)32 and not by the usual rule that FX is set to 1 when an exception bit changes
from 0 to 1). Bits 33 and 34 (FEX and VX) are set according to the usual rule, given on
page 70, and not from (FRB)33:34.
FPSCRBF×4+32:BF×4+35 ← U
The value of the U field is placed into Floating-Point Status and Control Register
field BF.
Programming Note
When FPSCR32:35 is specified, bits 32 (FX) and 35 (OX) are set to the values of U0 and
U3 (i.e., even if this instruction causes OX to change from 0 to 1, FX is set from U0 and
not by the usual rule that FX is set to 1 when an exception bit changes from 0 to 1). Bits
33 and 34 (FEX and VX) are set according to the usual rule, given on page 70, and not
from U1:2.
mtmsr RS
0 1 1 1 1 1 RS /// 0 0 1 0 0 1 0 0 1 0 /
0 6 11 21 31
MSR ← GPR(RS)32:63
The contents of bits 32:63 of GPR(RS) are placed into the MSR.
Programming Note
For a discussion of software synchronization requirements when altering certain MSR
bits, please refer to Chapter 11, “Synchronization Requirements”, on page 225.
mtspr SPRN,RS
0 1 1 1 1 1 RS sprn5:9 sprn0:4 0 1 1 1 0 1 0 0 1 1 /
0 6 11 16 21 31
Let SPRN denote a Special Purpose Register (see Section B.1 for a list of Special
Purpose Registers defined by Book E, Section B.3 for a list of SPRN values
reserved by Book E, Section B.4 for a list of SPRN values allocated by Book E,
Section B.2 for a list of SPRN values preserved by Book E, and the User’s Manual
of the implementation for a list of all Special Purpose Registers that are imple-
mented).
The contents of GPR(RS) are placed into the designated Special Purpose Register.
For 32-bit Special Purpose Registers, the contents of bits 32:63 of GPR(RS) are
placed into the Special Purpose Register.
Programming Note
For a discussion of software synchronization requirements when altering certain Special
Purpose Registers, please refer to Chapter 11, “Synchronization Requirements”, on
page 225.
mulhd RT,RA,RB
0 1 1 1 1 1 RT RA RB / 0 0 1 0 0 1 0 0 1 /
0 6 11 16 21 22 31
Bits 0:63 of the 128-bit product of the contents of GPR(RA) and the contents of
GPR(RB) are placed into GPR(RT).
mulhdu RT,RA,RB
0 1 1 1 1 1 RT RA RB / 0 0 0 0 0 1 0 0 1 /
0 6 11 16 21 22 31
Bits 0:63 of the 128-bit product of the contents of GPR(RA) and the contents of
GPR(RB) are placed into GPR(RT).
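A minimal Python sketch of the high-order product, assuming (as the mnemonics suggest) that mulhd interprets its operands as signed 64-bit integers and mulhdu interprets them as unsigned. The helper names are hypothetical, and register values are modeled as integers in the range 0 to 2^64-1.

```python
MASK64 = (1 << 64) - 1


def to_signed64(x):
    """Reinterpret a 64-bit pattern as a two's-complement value."""
    return x - (1 << 64) if x >> 63 else x


def mulhdu(ra, rb):
    """Bits 0:63 of the 128-bit unsigned product."""
    return (ra * rb) >> 64


def mulhd(ra, rb):
    """Bits 0:63 of the 128-bit signed product."""
    return ((to_signed64(ra) * to_signed64(rb)) >> 64) & MASK64
```

The same bit patterns can therefore give different results: with both operands all ones, the unsigned form returns 2^64-2 while the signed form, computing (-1)×(-1), returns 0.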
Bits 0:31 of the 64-bit product of the contents of bits 32:63 of GPR(RA) and the
contents of bits 32:63 of GPR(RB) are placed into bits 32:63 of GPR(RT). Bits 0:31
of GPR(RT) are undefined.
Bits 0:31 of the 64-bit product of the contents of bits 32:63 of GPR(RA) and the
contents of bits 32:63 of GPR(RB) are placed into bits 32:63 of GPR(RT). Bits 0:31 of
GPR(RT) are undefined.
Both operands and the product are interpreted as unsigned integers, except that
if Rc=1 the first three bits of CR Field 0 are set by signed comparison of the result
to zero.
Bits 64:127 of the 128-bit product of the contents of GPR(RA) and the contents of
GPR(RB) are placed into GPR(RT).
Programming Note
The Multiply instructions that set the XER may execute faster on some implementations
if GPR(RB) contains the operand having the smaller absolute value.
mulli RT,RA,SI
0 0 0 1 1 1 RT RA SI
0 5 6 11 16 31
Bits 64:127 of the 128-bit product of the contents of GPR(RA) and the sign-
extended value of the SI field are placed into GPR(RT).
The 64-bit product of the contents of bits 32:63 of GPR(RA) and the contents of
bits 32:63 of GPR(RB) is placed into GPR(RT).
Programming Note
For mulli and mulld, the low-order 64 bits of the product are independent of whether
the operands are regarded as signed or unsigned 64-bit integers.
For mulli and mullw, bits 32:63 of the product are independent of whether the oper-
ands are regarded as signed or unsigned 32-bit integers.
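The independence from signedness stated in this note can be checked numerically. The Python sketch below is illustrative only; low_product and to_signed are hypothetical helper names.

```python
def to_signed(x, bits):
    """Reinterpret a bits-wide pattern as a two's-complement value."""
    return x - (1 << bits) if x >> (bits - 1) else x


def low_product(a, b, bits, signed):
    """Low-order `bits` bits of the product of two bits-wide operands."""
    if signed:
        a, b = to_signed(a, bits), to_signed(b, bits)
    return (a * b) & ((1 << bits) - 1)


samples = [0, 1, 3, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
agree = all(
    low_product(a, b, 32, True) == low_product(a, b, 32, False)
    for a in samples
    for b in samples
)
```

The two interpretations differ only by multiples of 2^bits, which the final mask discards.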
The contents of GPR(RS) are ANDed with the contents of GPR(RB) and the one’s
complement of the result is placed into GPR(RA).
carry0:63 ← Carry(¬GPR(RA) + 1)
sum0:63 ← ¬GPR(RA) + 1
if OE=1 then do
OV ← carry32 ⊕ carry33
SO ← SO | (carry32 ⊕ carry33)
OV64 ← carry0 ⊕ carry1
SO64 ← SO64 | (carry0 ⊕ carry1)
if Rc=1 then do
LT ← sum32:63 < 0
GT ← sum32:63 > 0
EQ ← sum32:63 = 0
CR0 ← LT || GT || EQ || SO
GPR(RT) ← sum
The sum of the one’s complement of the contents of GPR(RA) and 1 is placed into
GPR(RT).
The contents of GPR(RS) are ORed with the contents of GPR(RB) and the one’s
complement of the result is placed into GPR(RA).
or RA,RS,RB (Rc=0)
or. RA,RS,RB (Rc=1)
0 1 1 1 1 1 RS RA RB 0 1 1 0 1 1 1 1 0 0 Rc
0 6 11 16 21 31
For ori, the contents of GPR(RS) are ORed with ⁴⁸0 || UI.
For oris, the contents of GPR(RS) are ORed with ³²0 || UI || ¹⁶0.
For or[.], the contents of GPR(RS) are ORed with the contents of GPR(RB).
For orc[.], the contents of GPR(RS) are ORed with the one’s complement of the
contents of GPR(RB).
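The immediate operands for ori and oris (UI zero-extended to 64 bits, and UI followed by 16 zero bits, respectively) can be sketched as below. The function names mirror the mnemonics but are otherwise hypothetical, and each function returns the ORed value rather than updating a register file.

```python
def ori(rs, ui):
    """OR GPR(RS) with the 16-bit UI placed in bits 48:63."""
    return rs | (ui & 0xFFFF)


def oris(rs, ui):
    """OR GPR(RS) with the 16-bit UI placed in bits 32:47."""
    return rs | ((ui & 0xFFFF) << 16)


# One possible use: build a 32-bit pattern from two 16-bit halves.
value = ori(oris(0, 0x1234), 0x5678)
```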
ori 0,0,0
Engineering Note
It is desirable for implementations to make the preferred form of no-op execute quickly,
since this form should be used by compilers.
rfci
0 1 0 0 1 1 /// 0 0 0 0 1 1 0 0 1 1 /
0 6 21 31
MSR ← CSRR1
NIA ← CSRR00:61 || 0b00
The rfci instruction is used to return from a critical class interrupt, or as a means
of establishing a new context and synchronizing on that new context simulta-
neously.
The contents of Critical Save/Restore Register 1 are placed into the Machine State
Register. If the new Machine State Register value does not enable any pending
exceptions, then the next instruction is fetched, under control of the new Machine
State Register value, from the address CSRR00:61||0b00. If the new Machine State
Register value enables one or more pending exceptions, the interrupt associated
with the highest priority pending exception is generated; in this case the value
placed into Save/Restore Register 0 or Critical Save/Restore Register 0 by the
interrupt processing mechanism (see Section 7.5 on page 151) is the address of
the instruction that would have been executed next had the interrupt not
occurred (i.e. the address in Critical Save/Restore Register 0 at the time of the
execution of the rfci).
Programming Note
In addition to Branch to Link Register (bclr[e][l]) and Branch to Count Register (bcctr[e][l])
instructions, rfi and rfci allow software to branch to any valid 64-bit address by using
the respective 64-bit Save/Restore Register 0 and Critical Save/Restore Register 0.
rfi
0 1 0 0 1 1 /// 0 0 0 0 1 1 0 0 1 0 /
0 6 21 31
MSR ← SRR1
NIA ← SRR00:61 || 0b00
The contents of Save/Restore Register 1 are placed into the Machine State Regis-
ter. If the new Machine State Register value does not enable any pending excep-
tions, then the next instruction is fetched, under control of the new Machine State
Register value, from the address SRR00:61||0b00. If the new Machine State Regis-
ter value enables one or more pending exceptions, the interrupt associated with
the highest priority pending exception is generated; in this case the value placed
into Save/Restore Register 0 or Critical Save/Restore Register 0 by the interrupt
processing mechanism (see Section 7.5 on page 151) is the address of the instruc-
tion that would have been executed next had the interrupt not occurred (i.e. the
address in Save/Restore Register 0 at the time of the execution of the rfi).
rldcl RA,RS,RB,mb
0 1 1 1 1 0 RS RA RB mb1:5 mb0 1 0 0 0 /
0 6 11 16 21 26 27 31
rldicl RA,RS,sh,mb
0 1 1 1 1 0 RS RA sh1:5 mb1:5 mb0 0 0 0 sh0 /
0 6 11 16 21 26 27 30 31
If rldcl, let the shift count n be the contents of bits 58:63 of GPR(RB).
The contents of GPR(RS) are rotated64 left n bits. A mask is generated having ‘1’
bits from bit mb through bit 63 and ‘0’ bits elsewhere. The rotated data are ANDed
with the generated mask and the result is placed into GPR(RA).
Programming Note
rldcl:
• Can be used to extract a k-bit field that starts at variable bit position j in
GPR(RS), right-justified into GPR(RA) (clearing the remaining 64–k bits of
GPR(RA)), by setting GPR(RB)58:63=j+k and mb=64–k.
• Can be used to rotate the contents of a register left by variable k bits, by setting
GPR(RB)58:63=k and mb=0.
• Can be used to rotate the contents of a register right by variable k bits, by setting
GPR(RB)58:63=64–k and mb=0.
rldicl:
• Can be used to extract a k-bit field that starts at bit position j in GPR(RS),
right-justified into GPR(RA) (clearing the remaining 64–k bits of GPR(RA)), by
setting sh=j+k and mb=64–k.
• Can be used to rotate the contents of a register left by k bits, by setting sh=k and
mb=0.
• Can be used to rotate the contents of a register right by k bits, by setting sh=64–k
and mb=0.
• Can be used to shift the contents of a register right by k bits, by setting sh=64–k
and mb=k.
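The rldicl uses listed in this note can be tried out with a small Python model. It is a sketch under stated assumptions: register values are integers in the range 0 to 2^64-1, bit 0 is the most significant bit, sh is reduced modulo 64 because it is a 6-bit field, and the helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def rotl64(x, n):
    """Rotate a 64-bit value left by n bits (n taken modulo 64)."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def rldicl(rs, sh, mb):
    """Rotate left by sh, then keep bits mb through 63."""
    return rotl64(rs, sh) & (MASK64 >> mb)


def extract_field(rs, j, k):
    """Right-justify the k-bit field starting at bit j: sh=j+k, mb=64-k."""
    return rldicl(rs, (j + k) % 64, 64 - k)


def shift_right(rs, k):
    """Logical shift right by k bits: sh=64-k, mb=k."""
    return rldicl(rs, (64 - k) % 64, k)
```

For instance, extract_field(0x00AB000000000000, 8, 8) isolates the byte held in bits 8:15.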
rldcr RA,RS,RB,me
0 1 1 1 1 0 RS RA RB me1:5 me0 1 0 0 1 /
0 6 11 16 21 26 27 31
rldicr RA,RS,sh,me
0 1 1 1 1 0 RS RA sh1:5 me1:5 me0 0 0 1 sh0 /
0 6 11 16 21 26 27 30 31
If rldcr, let the shift count n be the contents of bits 58:63 of GPR(RB).
The contents of GPR(RS) are rotated64 left n bits. A mask is generated having ‘1’
bits from bit 0 through bit me and ‘0’ bits elsewhere. The rotated data are ANDed
with the generated mask and the result is placed into GPR(RA).
Programming Note
rldcr:
• Can be used to extract a k-bit field that starts at variable bit position j in
GPR(RS), left-justified into GPR(RA) (clearing the remaining 64–k bits of
GPR(RA)), by setting GPR(RB)58:63=j and me=k–1.
• Can be used to rotate the contents of a register left by variable k bits, by setting
GPR(RB)58:63=k and me=63.
• Can be used to rotate the contents of a register right by variable k bits, by setting
GPR(RB)58:63=64–k and me=63.
rldicr:
• Can be used to extract a k-bit field that starts at bit position j in GPR(RS),
left-justified into GPR(RA) (clearing the remaining 64–k bits of GPR(RA)), by
setting sh=j and me=k–1.
• Can be used to rotate the contents of a register left by k bits, by setting sh=k and
me=63.
• Can be used to rotate the contents of a register right by k bits, by setting sh=64–k
and me=63.
• Can be used to shift the contents of a register left by k bits, by setting sh=k and
me=63–k.
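A corresponding Python sketch for rldicr, under the same assumptions as before (64-bit values held as non-negative integers, bit 0 the most significant bit, hypothetical helper names):

```python
MASK64 = (1 << 64) - 1


def rotl64(x, n):
    """Rotate a 64-bit value left by n bits (n taken modulo 64)."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def rldicr(rs, sh, me):
    """Rotate left by sh, then keep bits 0 through me."""
    keep = MASK64 & ~(MASK64 >> (me + 1))     # ones in bits 0 through me
    return rotl64(rs, sh) & keep


def shift_left(rs, k):
    """Shift left by k bits: sh=k, me=63-k."""
    return rldicr(rs, k, 63 - k)


def extract_left(rs, j, k):
    """Left-justify the k-bit field starting at bit j: sh=j, me=k-1."""
    return rldicr(rs, j, k - 1)
```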
rldic RA,RS,sh,mb
0 1 1 1 1 0 RS RA sh1:5 mb1:5 mb0 0 1 0 sh0 /
0 6 11 16 21 26 27 30 31
n ← sh0 || sh1:5
b ← mb0 || mb1:5
r ← ROTL64(GPR(RS),n)
m ← MASK(b,¬n)
GPR(RA) ← r & m
The contents of GPR(RS) are rotated64 left n bits. A mask is generated having ‘1’
bits from bit mb through bit 63-sh and ‘0’ bits elsewhere. The rotated data are
ANDed with the generated mask and the result is placed into GPR(RA).
Programming Note
• Can be used to clear the high-order j bits of the contents of a register and then shift
the result left by k bits, by setting sh=k and mb=j–k.
• Can be used to clear the high-order k bits of a register, by setting sh=0 and mb=k.
rldimi RA,RS,sh,mb
0 1 1 1 1 0 RS RA sh1:5 mb1:5 mb0 0 1 1 sh0 /
0 6 11 16 21 26 27 30 31
n ← sh0 || sh1:5
b ← mb0 || mb1:5
r ← ROTL64(GPR(RS),n)
m ← MASK(b,¬n)
GPR(RA) ← r&m | GPR(RA)&¬m
The contents of GPR(RS) are rotated64 left n bits. A mask is generated having ‘1’
bits from bit mb through bit 63-sh and ‘0’ bits elsewhere. The rotated data are
inserted into GPR(RA) under control of the generated mask (if a mask bit is 1 the
associated bit of the rotated data is placed into the target register, and if the mask
bit is 0 the associated bit in the target register remains unchanged).
Programming Note
rldimi can be used to insert a k-bit field that is right-justified in GPR(RS), into GPR(RA)
starting at bit position j, by setting sh=64-(j+k) and mb=j.
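The insertion described in this note can be sketched in Python as follows. The model assumes 64-bit values held as non-negative integers with bit 0 as the most significant bit, and a mask that does not wrap (mb ≤ 63–sh); all helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def rotl64(x, n):
    """Rotate a 64-bit value left by n bits (n taken modulo 64)."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def mask(mb, me):
    """Ones in bits mb through me; this sketch assumes mb <= me."""
    return (MASK64 >> mb) & ~(MASK64 >> (me + 1))


def rldimi(ra, rs, sh, mb):
    """Insert the rotated GPR(RS) into GPR(RA) under MASK(mb, 63-sh)."""
    m = mask(mb, 63 - sh)
    return (rotl64(rs, sh) & m) | (ra & ~m)


def insert_field(ra, rs, j, k):
    """Insert the right-justified k-bit field of rs at bit j: sh=64-(j+k), mb=j."""
    return rldimi(ra, rs, 64 - (j + k), j)
```

Bits of the target outside the mask are left untouched, which is what distinguishes the insert form from the AND-with-mask rotates.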
n ← SH
b ← MB+32
e ← ME+32
r ← ROTL32(GPR(RS)32:63,n)
m ← MASK(b,e)
result0:63 ← r&m | GPR(RA)&¬m
if Rc=1 then do
LT ← result32:63 < 0
GT ← result32:63 > 0
EQ ← result32:63 = 0
CR0 ← LT || GT || EQ || SO
GPR(RA) ← result0:63
The contents of GPR(RS) are rotated32 left n bits. A mask is generated having ‘1’
bits from bit MB+32 through bit ME+32 and ‘0’ bits elsewhere. The rotated data
are inserted into GPR(RA) under control of the generated mask (if a mask bit is 1
the associated bit of the rotated data is placed into the target register, and if the
mask bit is 0 the associated bit in the target register remains unchanged).
Programming Note
• Can be used to insert a k-bit field that is left-justified in bits 32:63 of GPR(RS), into
bits 32:63 of GPR(RA) starting at bit position j, by setting SH=64-j, MB=j-32, and
ME=(j+k)-33.
• Can be used to insert a k-bit field that is right-justified in bits 32:63 of GPR(RS),
into bits 32:63 of GPR(RA) starting at bit position j, by setting SH=64-(j+k), MB=j-32,
and ME=(j+k)-33.
If rlwnm[.], let the shift count n be the contents of bits 59:63 of GPR(RB).
The contents of GPR(RS) are rotated32 left n bits. A mask is generated having ‘1’
bits from bit MB+32 through bit ME+32 and ‘0’ bits elsewhere. The rotated data
are ANDed with the generated mask and the result is placed into GPR(RA).
rlwnm[.]:
• Can be used to extract a k-bit field that starts at variable bit position j in bits
32:63 of GPR(RS), right-justified into bits 32:63 of GPR(RA) (clearing the
remaining 32–k bits of bits 32:63 of GPR(RA)), by setting GPR(RB)59:63=j+k–32,
MB=32–k, and ME=31.
• Can be used to extract a k-bit field that starts at variable bit position j in bits
32:63 of GPR(RS), left-justified into bits 32:63 of GPR(RA) (clearing the
remaining 32–k bits of bits 32:63 of GPR(RA)), by setting GPR(RB)59:63=j–32,
MB=0, and ME=k–1.
• Can be used to rotate the contents of bits 32:63 of a register left by variable k
bits, by setting GPR(RB)59:63=k, MB=0, and ME=31.
• Can be used to rotate the contents of bits 32:63 of a register right by variable k
bits, by setting GPR(RB)59:63=32–k, MB=0, and ME=31.
rlwinm[.]:
• Can be used to extract a k-bit field that starts at bit position j in bits 32:63 of
GPR(RS), right-justified into bits 32:63 of GPR(RA) (clearing the remaining 32–k
bits of bits 32:63 of GPR(RA)), by setting SH=j+k–32, MB=32–k, and ME=31.
• Can be used to extract a k-bit field that starts at bit position j in bits 32:63 of
GPR(RS), left-justified into bits 32:63 of GPR(RA) (clearing the remaining 32–k
bits of bits 32:63 of GPR(RA)), by setting SH=j–32, MB=0, and ME=k–1.
• Can be used to rotate the contents of bits 32:63 of a register left by k bits, by
setting SH=k, MB=0, and ME=31.
• Can be used to rotate the contents of bits 32:63 of a register right by k bits, by
setting SH=32–k, MB=0, and ME=31.
For all the uses given above, bits 0:31 of GPR(RA) are cleared.
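The immediate-form word rotate (SH, MB, ME fields) can be modeled with a short Python sketch. It assumes MB and ME are the 5-bit field values, so that the mask covers bits MB+32 through ME+32 of the 64-bit register, and it returns the full GPR(RA) value with bits 0:31 cleared. The function names are hypothetical.

```python
WORD = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate bits 32:63 of x left by n bits (n taken modulo 32)."""
    x &= WORD
    n %= 32
    return ((x << n) | (x >> (32 - n))) & WORD


def mask32(mb, me):
    """Ones from bit MB+32 through bit ME+32, wrapping around if MB > ME."""
    from_mb = WORD >> mb                      # bits MB+32 through 63
    up_to_me = WORD & ~(WORD >> (me + 1))     # bits 32 through ME+32
    return from_mb & up_to_me if mb <= me else from_mb | up_to_me


def rlwinm(rs, sh, mb, me):
    """Rotate the low-order word left by SH, then AND with the mask."""
    return rotl32(rs, sh) & mask32(mb, me)
```

For a field of k=8 bits at j=40, the right-justified extraction uses SH=16, MB=24, ME=31, and the left-justified extraction uses SH=8, MB=0, ME=7.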
sc
0 1 0 0 0 1 /// 1 /
0 6 30 31
SRR1 ← MSR
SRR0 ← CIA+4
NIA ← IVPR0:47 || IVOR848:59 || 0b0000
MSRWE,EE,PR,IS,DS,FP,FE0,FE1 ← 0b0000_0000
The interrupt causes the next instruction to be fetched from the address
IVPR0:47||IVOR848:59||0b0000.
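The concatenation that forms the interrupt vector address can be sketched in Python as below, with both registers modeled as 64-bit integers and bit 0 as the most significant bit. The function name and the sample register values are assumptions for illustration.

```python
def sc_vector_address(ivpr, ivor8):
    """Form IVPR bits 0:47 || IVOR8 bits 48:59 || 0b0000."""
    prefix = ivpr & 0xFFFF_FFFF_FFFF_0000     # bits 0:47 of IVPR
    offset = ivor8 & 0xFFF0                   # bits 48:59 of IVOR8; low 4 bits forced to 0
    return prefix | offset


address = sc_vector_address(0x0000_0000_FFF0_1234, 0xABCD)
```

Because the four low-order bits are always 0, every vector address produced this way is aligned to a 16-byte boundary.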
sld RA,RS,RB
0 1 1 1 1 1 RS RA RB 0 0 0 0 0 1 1 0 1 1 /
0 6 11 16 21 31
n ← GPR(RB)58:63
r ← ROTL64(GPR(RS),n)
if GPR(RB)57=0 then m ← MASK(0,63-n)
else m ← ⁶⁴0
GPR(RA) ← r & m
Let the shift count n be the value specified by the contents of bits 57:63 of
GPR(RB).
The contents of GPR(RS) are shifted left n bits. Bits shifted out of position 0 are
lost. Zeros are supplied to the vacated positions on the right. The result is placed
into GPR(RA).
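The rotate-and-mask formulation used for sld gives the same result as an ordinary left shift. The Python sketch below follows the steps described above, assuming 64-bit values held as non-negative integers; the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def rotl64(x, n):
    """Rotate a 64-bit value left by n bits (n taken modulo 64)."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def sld(rs, rb):
    """Shift GPR(RS) left by the count held in bits 57:63 of GPR(RB)."""
    n = rb & 0x3F                             # bits 58:63 of GPR(RB)
    if rb & 0x40:                             # bit 57 set: count is 64 to 127
        return 0
    keep = MASK64 & ~((1 << n) - 1)           # MASK(0, 63-n): clears the n vacated bits
    return rotl64(rs, n) & keep
```

Only bits 57:63 of GPR(RB) take part, so higher-order bits of the count register are ignored.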
n ← GPR(RB)59:63
r ← ROTL32(GPR(RS)32:63,n)
if GPR(RB)58=0 then m ← MASK(32,63-n)
else m ← ⁶⁴0
result0:63 ← r & m
if Rc=1 then do
LT ← result32:63 < 0
GT ← result32:63 > 0
EQ ← result32:63 = 0
CR0 ← LT || GT || EQ || SO
GPR(RA) ← result0:63
Let the shift count n be the value specified by the contents of bits 58:63 of
GPR(RB).
The contents of bits 32:63 of GPR(RS) are shifted left n bits. Bits shifted out of
position 32 are lost. Zeros are supplied to the vacated positions on the right. The
32-bit result is placed into bits 32:63 of GPR(RA). Bits 0:31 of GPR(RA) are set to
zero.
srad RA,RS,RB
0 1 1 1 1 1 RS RA RB 1 1 0 0 0 1 1 0 1 0 /
0 6 11 16 21 31
sradi RA,RS,sh
0 1 1 1 1 1 RS RA sh1:5 1 1 0 0 1 1 1 0 1 sh0 /
0 6 11 16 21 30 31
If srad, let the shift count n be the contents of bits 57:63 of GPR(RB).
The contents of GPR(RS) are shifted right n bits. Bits shifted out of position 63 are
lost. Bit 0 of the contents of GPR(RS) is replicated to fill the vacated positions on
the left. The result is placed into GPR(RA).
CA64 is set to 1 if GPR(RS) contains a negative value and any ‘1’ bits are shifted
out of bit position 63; otherwise CA64 is set to 0.
A shift amount of zero causes GPR(RA) to be set equal to the contents of GPR(RS),
and CA64 to be set to 0. For srad, shift amounts from 64 to 127 give a result of 64
sign bits in GPR(RA), and cause CA64 to receive bit 0 of the contents of GPR(RS)
(i.e. the sign bit of GPR(RS)).
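The result and carry rules for srad can be sketched in Python as follows. This is a model under stated assumptions: GPR values are integers in the range 0 to 2^64-1, the carry is returned alongside the result rather than written to the XER, and the function name is hypothetical.

```python
MASK64 = (1 << 64) - 1


def srad(rs, rb):
    """Return (GPR(RA), carry) for a count taken from bits 57:63 of GPR(RB)."""
    n = rb & 0x7F
    signed = rs - (1 << 64) if rs >> 63 else rs
    result = (signed >> n) & MASK64           # >> on a negative int replicates the sign
    shifted_out = rs & ((1 << n) - 1)         # for n >= 64 this covers all of rs
    carry = int(signed < 0 and shifted_out != 0)
    return result, carry
```

The carry makes it possible to tell whether an arithmetic right shift of a negative value discarded any 1 bits, i.e. whether the shift rounded toward negative infinity rather than dividing exactly.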
If sraw[.], let the shift count n be the contents of bits 58:63 of GPR(RB).
The contents of bits 32:63 of GPR(RS) are shifted right n bits. Bits shifted out of
position 63 are lost. Bit 32 of the contents of GPR(RS) is replicated to fill the
vacated positions on the left. The 32-bit result is placed into bits 32:63 of GPR(RA).
Bit 32 of the contents of GPR(RS) is also replicated to fill bits 0:31 of GPR(RA).
CA is set to 1 if bits 32:63 of GPR(RS) contain a negative value and any ‘1’ bits are
shifted out of bit position 63; otherwise CA is set to 0.
srd RA,RS,RB
0 1 1 1 1 1 RS RA RB 1 0 0 0 0 1 1 0 1 1 /
0 6 11 16 21 31
n ← GPR(RB)58:63
r ← ROTL64(GPR(RS),64-n)
if GPR(RB)57=0 then m ← MASK(n,63)
else m ← ⁶⁴0
GPR(RA) ← r & m
Let the shift count n be the value specified by the contents of bits 57:63 of
GPR(RB).
The contents of GPR(RS) are shifted right n bits. Bits shifted out of position 63 are
lost. Zeros are supplied to the vacated positions on the left. The result is placed
into GPR(RA).
n ← GPR(RB)59:63
r ← ROTL32(GPR(RS)32:63,64-n)
if GPR(RB)58=0 then m ← MASK(n+32,63)
else m ← ⁶⁴0
result0:63 ← r & m
if Rc=1 then do
LT ← result32:63 < 0
GT ← result32:63 > 0
EQ ← result32:63 = 0
CR0 ← LT || GT || EQ || SO
GPR(RA) ← result0:63
Let the shift count n be the value specified by the contents of bits 58:63 of
GPR(RB).
The contents of bits 32:63 of GPR(RS) are shifted right n bits. Bits shifted out of
position 63 are lost. Zeros are supplied to the vacated positions on the left. The
32-bit result is placed into bits 32:63 of GPR(RA). Bits 0:31 of GPR(RA) are set to
zero.
• For stb and stbu, let EA be 32 0s concatenated with bits 32:63 of the sum of
the contents of GPR(RA), or 64 0s if RA=0, and the sign-extended value of the
D instruction field.
• For stbx and stbux, let EA be 32 0s concatenated with bits 32:63 of the sum
of the contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
• For stbe and stbue, let EA be the sum of the contents of GPR(RA), or 64 0s if
RA=0, and the sign-extended value of the DE instruction field.
• For stbxe and stbuxe, let EA be the sum of the contents of GPR(RA), or 64 0s
if RA=0, and the contents of GPR(RB).
The contents of bits 56:63 of GPR(RS) are stored into the byte in storage
addressed by EA.
stdcxe. RS,RA,RB
0 1 1 1 1 1 RS RA RB 0 1 1 1 1 1 1 1 1 1 1
0 6 11 16 21 31
Let the effective address (EA) be the sum of the contents of GPR(RA), or 64 0s if
RA=0, and the contents of GPR(RB).
If a reservation exists and the storage address specified by the stdcxe. is the same
as that specified by the ldarxe instruction that established the reservation, the
contents of GPR(RS) are stored into the doubleword in storage addressed by EA and
the reservation is cleared.
If a reservation exists but the storage address specified by the stdcxe. is not the
same as that specified by the ldarxe instruction that established the reservation,
the reservation is cleared, and it is undefined whether the instruction completes
without altering storage.
If a reservation does not exist, the instruction completes without altering storage.
CR Field 0 is set to reflect whether the store operation was performed, as follows.
• For stde and stdue, let EA be the sum of the contents of GPR(RA), or 64 0s if
RA=0, and the sign-extended value of the DES instruction field concatenated
with 0b00.
• For stdxe and stduxe, let EA be the sum of the contents of GPR(RA), or 64 0s
if RA=0, and the contents of GPR(RB).
The contents of GPR(RS) are stored into the doubleword in storage addressed by
EA.
• For stfd and stfdu, let EA be 32 0s concatenated with bits 32:63 of the sum
of the contents of GPR(RA), or 64 0s if RA=0, and the sign-extended value of
the D instruction field.
• For stfdx and stfdux, let EA be 32 0s concatenated with bits 32:63 of the
sum of the contents of GPR(RA), or 64 0s if RA=0, and the contents of
GPR(RB).
• For stfde and stfdue, let EA be the sum of the contents of GPR(RA), or 64 0s
if RA=0, and the sign-extended value of the DES instruction field concate-
nated with 0b00.
• For stfdxe and stfduxe, let EA be the sum of the contents of GPR(RA), or 64
0s if RA=0, and the contents of GPR(RB).
The contents of FPR(FRS) are stored into the doubleword in storage addressed by
EA.
• For stfiwx, let EA be 32 0s concatenated with bits 32:63 of the sum of the
contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
The contents of bits 32:63 of FPR(FRS) are stored, without conversion, into the
word in storage addressed by EA.
• For stfs and stfsu, let EA be 32 0s concatenated with bits 32:63 of the sum of
the contents of GPR(RA), or 64 0s if RA=0, and the sign-extended value of the
D instruction field.
• For stfsx and stfsux, let EA be 32 0s concatenated with bits 32:63 of the sum
of the contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
• For stfse and stfsue, let EA be the sum of the contents of GPR(RA), or 64 0s if
RA=0, and the sign-extended value of the DES instruction field concatenated
with 0b00.
• For stfsxe and stfsuxe, let EA be the sum of the contents of GPR(RA), or 64
0s if RA=0, and the contents of GPR(RB).
The contents of FPR(FRS) are converted to single format (see page 100) and stored
into the word in storage addressed by EA.
• For sth and sthu, let EA be 32 0s concatenated with bits 32:63 of the sum of
the contents of GPR(RA), or 64 0s if RA=0, and the sign-extended value of the
D instruction field.
• For sthx and sthux, let EA be 32 0s concatenated with bits 32:63 of the sum
of the contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
• For sthe and sthue, let EA be the sum of the contents of GPR(RA), or 64 0s if
RA=0, and the sign-extended value of the DE instruction field.
• For sthxe and sthuxe, let EA be the sum of the contents of GPR(RA), or 64 0s
if RA=0, and the contents of GPR(RB).
The contents of bits 48:63 of GPR(RS) are stored into the halfword in storage
addressed by EA.
• For sthbrx, let EA be 32 0s concatenated with bits 32:63 of the sum of the
contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
Bits 56:63 of GPR(RS) are stored into bits 0:7 of the halfword in storage addressed
by EA. Bits 48:55 of GPR(RS) are stored into bits 8:15 of the halfword in storage
addressed by EA.
Programming Note
When EA references Big-Endian storage, these instructions have the effect of storing
data in Little-Endian byte order. Likewise, when EA references Little-Endian storage,
these instructions have the effect of storing data in Big-Endian byte order.
stmw RS,D(RA)
1 0 1 1 1 1 RS RA D
0 6 11 16 31
Let the effective address (EA) be 32 0s concatenated with bits 32:63 of the sum of
the contents of GPR(RA), or 64 0s if RA=0, and the sign-extended value of the D
instruction field.
Let n=(32-RS). Bits 32:63 of registers GPR(RS) through GPR(31) are stored into n
consecutive words in storage starting at address EA.
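The register-to-storage mapping of stmw can be sketched as follows. This is an illustrative Python model only; the function name, the use of a bytearray for storage, and Big-Endian byte order are assumptions of the sketch.

```python
def stmw(gpr, memory, rs, ea):
    """Store bits 32:63 of GPR(RS)..GPR(31) into n = 32 - RS words at EA.

    `memory` is a bytearray standing in for Big-Endian storage.
    Returns n, the number of words stored.
    """
    n = 32 - rs
    for i in range(n):
        word = gpr[rs + i] & 0xFFFFFFFF          # low-order word only
        start = ea + 4 * i
        memory[start:start + 4] = word.to_bytes(4, "big")
    return n
```

For RS=29 the model stores three words, taken from GPR(29), GPR(30), and GPR(31).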
Engineering Note
Causing an Alignment interrupt if an attempt is made to execute a Store Multiple
instruction having an incorrectly aligned effective address facilitates the debugging of software.
Architecture Note
Extended addressing modes are not defined for Store Multiple. Doubleword forms of Store
Multiple are not defined.
stswi RS,RA,NB
0 1 1 1 1 1 RS RA NB 1 0 1 1 0 1 0 1 0 1 /
0 6 11 16 21 31
stswx RS,RA,RB
0 1 1 1 1 1 RS RA RB 1 0 1 0 0 1 0 1 0 1 /
0 6 11 16 21 31
• For stswx, let EA be 32 0s concatenated with bits 32:63 of the sum of the
contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
If stswi then let n=NB if NB≠0, n=32 if NB=0. If stswx then let n=XER[57:63]. n is
the number of bytes to store. Let nr=CEIL(n÷4): nr is the number of registers to
supply data.
Bytes are stored left to right from each register. The sequence of registers wraps
around to GPR(0) if required.
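The byte count, register count, and wraparound rule can be illustrated with a small Python model. It is a sketch under stated assumptions: the function names are hypothetical, storage is a Big-Endian bytearray, and the data bytes are assumed to come from bits 32:63 of each register.

```python
def stswi_byte_count(nb):
    """stswi: n = NB if NB != 0, else 32."""
    return nb if nb != 0 else 32

def store_string_word(gpr, memory, rs, ea, n):
    """Store n bytes taken left to right from GPR(RS), GPR(RS+1), ...

    Each register supplies up to four bytes from its low-order word
    (an assumption of this sketch), and the register sequence wraps
    from GPR(31) to GPR(0). Returns nr, the number of registers used.
    """
    nr = -(-n // 4)                      # CEIL(n / 4)
    for i in range(n):
        reg = (rs + i // 4) % 32         # wrap around to GPR(0)
        word = (gpr[reg] & 0xFFFFFFFF).to_bytes(4, "big")
        memory[ea + i] = word[i % 4]
    return nr
```

Storing six bytes starting at GPR(31) uses two registers: all four bytes of GPR(31), then the two leftmost bytes of GPR(0).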
Programming Note
The Store String Word instructions, in combination with the Load String Word
instructions, allow movement of data from storage to registers or from registers to
storage without concern for alignment. These instructions can be used for a short move
between arbitrary storage locations or to initiate a long move between unaligned storage
fields.
Architecture Note
Extended addressing modes are not defined for the Store String Word instructions. Dou-
bleword forms of the Store String Word instructions are not defined.
• For stw and stwu, let EA be 32 0s concatenated with bits 32:63 of the sum of
the contents of GPR(RA), or 64 0s if RA=0, and the sign-extended value of the
D instruction field.
• For stwx and stwux, let EA be 32 0s concatenated with bits 32:63 of the sum
of the contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
• For stwe and stwue, let EA be the sum of the contents of GPR(RA), or 64 0s if
RA=0, and the sign-extended value of the DE instruction field.
• For stwxe and stwuxe, let EA be the sum of the contents of GPR(RA), or 64
0s if RA=0, and the contents of GPR(RB).
The contents of bits 32:63 of GPR(RS) are stored into the word in storage
addressed by EA.
• For stwbrx, let EA be 32 0s concatenated with bits 32:63 of the sum of the
contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
Bits 56:63 of GPR(RS) are stored into bits 0:7 of the word in storage addressed by
EA. Bits 48:55 of GPR(RS) are stored into bits 8:15 of the word in storage
addressed by EA. Bits 40:47 of GPR(RS) are stored into bits 16:23 of the word in
storage addressed by EA. Bits 32:39 of GPR(RS) are stored into bits 24:31 of the
word in storage addressed by EA.
Programming Note
When EA references Big-Endian storage, these instructions have the effect of storing
data in Little-Endian byte order. Likewise, when EA references Little-Endian storage,
these instructions have the effect of storing data in Big-Endian byte order.
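The byte placement described for stwbrx amounts to writing the low-order word of GPR(RS) least significant byte first. A minimal Python sketch, assuming Big-Endian storage and using hypothetical function names:

```python
def stwbrx_bytes(rs_value):
    """Bytes written at EA, EA+1, EA+2, EA+3 by stwbrx.

    Bits 56:63 of GPR(RS) go first and bits 32:39 go last, so the
    low-order word is written least significant byte first.
    """
    return (rs_value & 0xFFFFFFFF).to_bytes(4, "little")

def stw_bytes(rs_value):
    """Bytes written by an ordinary stw to Big-Endian storage."""
    return (rs_value & 0xFFFFFFFF).to_bytes(4, "big")
```

The two functions always return byte strings that are the reverse of each other, which is the effect the Programming Note describes.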
• For stwcx., let EA be 32 0s concatenated with bits 32:63 of the sum of the
contents of GPR(RA), or 64 0s if RA=0, and the contents of GPR(RB).
If a reservation exists and the storage address specified by the stwcx. or stwcxe.
is the same as that specified by the lwarx or lwarxe instruction that established
the reservation, the contents of bits 32:63 of GPR(RS) are stored into the word in
storage addressed by EA and the reservation is cleared.
If a reservation exists but the storage address specified by the stwcx. or stwcxe.
is not the same as that specified by the Load and Reserve instruction that estab-
lished the reservation, the reservation is cleared, and it is undefined whether the
instruction completes without altering storage.
If a reservation does not exist, the instruction completes without altering storage.
CR Field 0 is set to reflect whether the store operation was performed, as follows.
Programming Note
stwcx., stwcxe., and stdcxe., in combination with lwarx, lwarxe, and ldarxe, permit
the programmer to write a sequence of instructions that appear to perform an atomic
update operation on a storage location. This operation depends upon a single reserva-
tion resource in each processor. At most one reservation exists on any given processor:
there are not separate reservations for words and for doublewords.
Architecture Note
stwcx., stwcxe., and stdcxe. require the EA to be aligned. Software should not attempt
to emulate an unaligned stwcx., stwcxe., or stdcxe., because there is no correct way to
define the address associated with the reservation.
Engineering Note
Causing an Alignment interrupt to be invoked if an attempt is made to execute a stwcx.,
stwcxe., or stdcxe. having an incorrectly aligned effective address facilitates the debug-
ging of software by signalling the exception when and where the exception occurs.
Engineering Note
If a Store Conditional instruction produces an effective address for which a normal Store
would cause a Data Storage, Alignment, or Data TLB Error interrupt, but the processor
does not have the reservation from a Load and Reserve instruction, then it is
implementation-dependent whether a Data Storage, Alignment, or Data TLB Error interrupt
occurs. See the User’s Manual for the implementation.
Programming Note
The granularity with which reservations are managed is implementation-dependent.
Therefore the storage to be accessed by stwcx., stwcxe., or stdcxe. should be allocated
by a system library program. Additional information can be found in Section 6.1.6.2 on
page 117.
Programming Note
When correctly used, the Load And Reserve and Store Conditional instructions can pro-
vide an atomic update function for a single aligned word (Load Word And Reserve and
Store Word Conditional) or doubleword (Load Doubleword And Reserve and Store Double-
word Conditional) of storage.
In general, correct use requires that Load Word And Reserve be paired with Store Word
Conditional, and Load Doubleword And Reserve with Store Doubleword Conditional, with
the same storage address specified by both instructions of the pair. The only exception
is that an unpaired Store Word Conditional or Store Doubleword Conditional instruction
to any (scratch) effective address can be used to clear any reservation held by the pro-
cessor. Examples of correct uses of these instructions to emulate primitives such as
‘Fetch and Add’, ‘Test and Set’, and ‘Compare and Swap’ can be found in Section 11 on
page 225.
• The processor holding the reservation executes another Load And Reserve instruc-
tion; this clears the first reservation and establishes a new one.
• The processor holding the reservation executes a Store Conditional instruction to any
address.
• Another processor executes any Store instruction to the address associated with the
reservation.
• Any mechanism, other than the processor holding the reservation, stores to the
address associated with the reservation.
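The events listed above can be exercised with a toy Python model of the single per-processor reservation. This is a sketch, not the architected mechanism: class and function names are hypothetical, reservations are matched on the exact address rather than on an implementation-dependent granule, and the case of a mismatched address (architecturally undefined as to whether storage is altered) is modeled as not storing.

```python
class Processor:
    """Toy model of one processor's single reservation."""

    def __init__(self, memory):
        self.memory = memory        # shared dict: address -> word
        self.reservation = None     # at most one reserved address

    def lwarx(self, ea):
        self.reservation = ea       # replaces any earlier reservation
        return self.memory.get(ea, 0)

    def stwcx(self, ea, value, others=()):
        """Return True (modeling CR0[EQ]=1) if the store was performed."""
        held, self.reservation = self.reservation, None   # always cleared
        if held != ea:
            return False            # no reservation, or mismatched address
        self.memory[ea] = value
        notify_store(ea, others)
        return True

    def stw(self, ea, value, others=()):
        self.memory[ea] = value
        notify_store(ea, others)

def notify_store(ea, others):
    """A store by another processor clears reservations on that address."""
    for cpu in others:
        if cpu.reservation == ea:
            cpu.reservation = None
```

In this model a Store Conditional succeeds only when the same processor's most recent Load And Reserve named the same address and no other processor has stored there since.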
The sum of the one’s complement of the contents of GPR(RA), the contents of
GPR(RB), and 1 is placed into GPR(RT).
The sum of the one’s complement of the contents of GPR(RA), the contents of
GPR(RB), and 1 is placed into GPR(RT).
For subfe[o][.], the sum of the one’s complement of the contents of GPR(RA), the
contents of GPR(RB), and CA is placed into GPR(RT).
For subfe64[o], the sum of the one’s complement of the contents of GPR(RA), the
contents of GPR(RB), and CA64 is placed into GPR(RT).
subfic RT,RA,SI
0 0 1 0 0 0 RT RA SI
0 6 11 16 31
The sum of the one’s complement of the contents of GPR(RA), the sign-extended
value of the SI field, and 1 is placed into GPR(RT).
For subfme[o][.], the sum of the one’s complement of the contents of GPR(RA), CA,
and 64 1s is placed into GPR(RT).
For subfme64[o], the sum of the one’s complement of the contents of GPR(RA),
CA64, and 64 1s is placed into GPR(RT).
For subfze[o][.], the sum of the one’s complement of the contents of GPR(RA) and
CA is placed into GPR(RT).
For subfze64[o], the sum of the one’s complement of the contents of GPR(RA) and
CA64 is placed into GPR(RT).
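The Subtract From family is defined through one's complement addition, which is equivalent to two's complement subtraction. A minimal Python sketch of the results placed into GPR(RT) follows; the function names are hypothetical and the recording of carry and overflow is not modeled.

```python
MASK64 = (1 << 64) - 1

def ones_complement(x):
    return ~x & MASK64

def subf(ra, rb):
    """subf/subfc: ~RA + RB + 1, which equals RB - RA modulo 2**64."""
    return (ones_complement(ra) + rb + 1) & MASK64

def subfe(ra, rb, ca):
    """subfe: ~RA + RB + CA."""
    return (ones_complement(ra) + rb + ca) & MASK64

def subfme(ra, ca):
    """subfme: ~RA + CA + (64 1s), i.e. ~RA + CA - 1."""
    return (ones_complement(ra) + ca + MASK64) & MASK64

def subfze(ra, ca):
    """subfze: ~RA + CA."""
    return (ones_complement(ra) + ca) & MASK64
```

With CA=1, subfe gives the same result as subf, which is how a multiword subtraction propagates "no borrow" from the lower-order word.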
td TO,RA,RB
0 1 1 1 1 1 TO RA RB 0 0 0 1 0 0 0 1 0 0 /
0 6 11 16 21 31
tdi TO,RA,SI
0 0 0 0 1 0 TO RA SI
0 6 11 16 31
a ← GPR(RA)
if ‘td’ then b ← GPR(RB)
if ‘tdi’ then b ← EXTS(SI)
if (a < b) & TO[0] then TRAP
if (a > b) & TO[1] then TRAP
if (a = b) & TO[2] then TRAP
if (a <u b) & TO[3] then TRAP
if (a >u b) & TO[4] then TRAP
If td, the contents of GPR(RA) are compared with the contents of GPR(RB).
If tdi, the contents of GPR(RA) are compared with the sign-extended value of the
SI field.
If any bit in the TO field is set to 1 and its corresponding condition is met by the
result of the comparison, then the system trap handler is invoked.
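The five TO bits select signed and unsigned comparisons independently. The following Python sketch models the trap decision for td and tdi (for tdi, b would be the sign-extended SI field). The function names are hypothetical, and TO is passed as a 5-bit integer whose most significant bit is TO bit 0.

```python
MASK64 = (1 << 64) - 1

def to_signed64(x):
    """Interpret a 64-bit pattern as a two's complement integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x

def td_traps(to, a, b):
    """Return True if td/tdi would trap for the TO field and operands a, b."""
    sa, sb = to_signed64(a), to_signed64(b)
    ua, ub = a & MASK64, b & MASK64
    conditions = [sa < sb, sa > sb, sa == sb, ua < ub, ua > ub]
    # TO bit i is tested with the mask 0b10000 >> i.
    return any(cond and (to & (0b10000 >> i))
               for i, cond in enumerate(conditions))
```

The pattern of all ones against zero shows the difference: it is less than zero as a signed value but greater than zero as an unsigned value.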
Let the virtual address (VA) be the value AS || ProcessID || EA. See Figure 6-2 on
page 128.
The operation performed by this instruction is ordered by the mbar (or msync)
instruction with respect to a subsequent tlbsync instruction executed by the pro-
cessor executing the tlbivax[e] instruction. The operations caused by tlbivax[e]
and tlbsync are ordered by mbar as a set of operations which is independent of
the other sets that mbar orders.
Programming Note
The effects of the invalidation are not guaranteed to be visible to the programming model
until the completion of a context synchronizing operation (see Section 1.12.1 on
page 38).
tlbre
0 1 1 1 1 1 ??? 1 1 1 0 1 1 0 0 1 0 /
0 6 21 31
If the instruction specifies a TLB entry that does not exist, the results are unde-
fined.
Let the virtual address (VA) be the value AS || ProcessID || EA. See Figure 6-2 on
page 128.
tlbsync
0 1 1 1 1 1 /// 1 0 0 0 1 1 0 1 1 0 /
0 6 21 31
The tlbsync instruction provides an ordering function for the effects of all
tlbivax[e] instructions executed by the processor executing the tlbsync instruc-
tion, with respect to the memory barrier created by a subsequent msync instruc-
tion executed by the same processor. Executing a tlbsync instruction ensures
that all of the following will occur.
• All storage accesses by other processors for which the address was translated
using the translations being invalidated, will have been performed with
respect to the processor executing the msync instruction, to the extent
required by the associated Memory Coherence Required attributes, before the
mbar or msync instruction’s memory barrier is created.
The operation performed by this instruction is ordered by the mbar and msync
instructions with respect to preceding tlbivax[e] instructions executed by the pro-
cessor executing the tlbsync instruction. The operations caused by tlbivax[e] and
tlbsync are ordered by mbar as a set of operations, which is independent of the
other sets that mbar orders.
tlbwe
0 1 1 1 1 1 ??? 1 1 1 1 0 1 0 0 1 0 /
0 6 21 31
If the instruction specifies a TLB entry that does not exist, the results are unde-
fined.
Programming Notes
The effects of the update are not guaranteed to be visible to the programming model
until the completion of a context synchronizing operation. See Section 1.12.1 on
page 38.
tw TO,RA,RB
0 1 1 1 1 1 TO RA RB 0 0 0 0 0 0 0 1 0 0 /
0 6 11 16 21 31
twi TO,RA,SI
0 0 0 0 1 1 TO RA SI
0 6 11 16 31
a ← EXTS(GPR(RA)[32:63])
if ‘tw’ then b ← EXTS(GPR(RB)[32:63])
if ‘twi’ then b ← EXTS(SI)
if (a < b) & TO[0] then TRAP
if (a > b) & TO[1] then TRAP
if (a = b) & TO[2] then TRAP
if (a <u b) & TO[3] then TRAP
if (a >u b) & TO[4] then TRAP
For tw, the contents of bits 32:63 of GPR(RA) are compared with the contents of
bits 32:63 of GPR(RB).
For twi, the contents of bits 32:63 of GPR(RA) are compared with the sign-
extended value of the SI field.
If any bit in the TO field is set to 1 and its corresponding condition is met by the
result of the comparison, then the system trap handler is invoked.
wrtee RS
0 1 1 1 1 1 RS /// 0 0 1 0 0 0 0 0 1 1 /
0 6 11 21 31
wrteei E
0 1 1 1 1 1 /// E /// 0 0 1 0 1 0 0 0 1 1 /
0 6 16 17 21 31
For wrteei, the value specified in the E field is placed into MSR[EE].
In addition, alteration of the MSR[EE] bit is effective as soon as the instruction
completes. Thus if MSR[EE]=0 and an External interrupt is pending, executing a
wrtee or wrteei that sets MSR[EE] to 1 will cause the External interrupt to be
taken before the next instruction is executed, if no higher priority exception
exists. (See Section 7.9 on page 178.)
Programming Note
wrtee and wrteei are used to provide atomic update of MSR[EE]. Typical usage is:
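A save, disable, and restore pattern of this kind can be modeled in Python as follows. This is a hedged sketch rather than an instruction listing: the Core class, the critical_section helper, and the choice of scratch register are hypothetical, and MSR[EE] is assumed to be bit 48 in 64-bit bit numbering (mask 0x8000).

```python
MSR_EE = 0x8000   # assumed: MSR[EE] is bit 48, i.e. 1 << (63 - 48)

class Core:
    def __init__(self, msr=0):
        self.msr = msr
        self.gpr = [0] * 32

    def mfmsr(self, rt):
        self.gpr[rt] = self.msr

    def wrteei(self, e):
        """MSR[EE] <- E; no other MSR bit changes."""
        self.msr = (self.msr & ~MSR_EE) | (MSR_EE if e else 0)

    def wrtee(self, rs):
        """MSR[EE] <- bit 48 of GPR(RS); no other MSR bit changes."""
        self.msr = (self.msr & ~MSR_EE) | (self.gpr[rs] & MSR_EE)

def critical_section(core, body, scratch=5):
    core.mfmsr(scratch)     # save the current EE value in a GPR
    core.wrteei(0)          # disable External interrupts
    body(core)              # code that may change other MSR bits
    core.wrtee(scratch)     # restore EE only, keeping those changes
```

Because wrtee touches only MSR[EE], MSR changes made inside the protected region survive the restore, which a full mtmsr of the saved value would not guarantee.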
For xori, the contents of GPR(RS) are XORed with 48 0s || UI.
For xoris, the contents of GPR(RS) are XORed with 32 0s || UI || 16 0s.
For xor[.], the contents of GPR(RS) are XORed with the contents of GPR(RB).
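The placement of the 16-bit UI field within the 64-bit operand can be sketched in Python. The function names are hypothetical; the sketch models only the value produced, not any recording in the Condition Register.

```python
MASK64 = (1 << 64) - 1

def xori(rs, ui):
    """RS XOR (48 0s || UI): only bits 48:63 can change."""
    return (rs ^ (ui & 0xFFFF)) & MASK64

def xoris(rs, ui):
    """RS XOR (32 0s || UI || 16 0s): only bits 32:47 can change."""
    return (rs ^ ((ui & 0xFFFF) << 16)) & MASK64

def xor(rs, rb):
    return (rs ^ rb) & MASK64
```

For instance, xoris with UI=0x8000 toggles bit 32 of the register, the sign bit of its low-order word.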
Likewise, other than floating-point instructions, all instructions which are defined
to return a 64-bit result shall return only bits 32:63 of the result on a 32-bit Book
E implementation.
Special Purpose Registers (SPRs) are on-chip registers that are architecturally
part of the processor core. They are accessed with the mtspr (page 316) and
mfspr (page 309) instructions. Encodings not listed are reserved for future use or
for use as implementation-specific registers.
In Table B-1, the column ‘SPRN’ (SPR number) lists register numbers, which are
used in the instruction mnemonics.
Special purpose registers control the use of the debug facilities, the timers, the
interrupts, the memory management unit, and other architected processor
resources.
Table B-1 provides a summary of all Special Purpose Registers defined in the
Book E.
Defined SPR    SPR Name    Defined SPRN (Decimal, Binary)    Access    Privileged    Page
Preserved SPRNs are SPRNs that otherwise would be classified as reserved, but
have legacy use which requires their deployment in the Book E to be deferred as
long as possible in order to allow legacy hardware and software to migrate these
legacy Special Purpose Registers to Book E allocated SPRN space.
Preserved SPR    Preserved SPRN (Decimal)    Preserved SPRN (Binary)
PowerPC DSISR 18 00000 1 0010
PowerPC DAR 19 00000 1 0011
PowerPC SDR1 25 00000 1 1001
8xx EIE 80 00010 1 0000
8xx EID 81 00010 1 0001
8xx NRE 82 00010 1 0010
5xx,8xx CMPA 144 00100 1 0000
5xx,8xx CMPB 145 00100 1 0001
5xx,8xx CMPC 146 00100 1 0010
5xx,8xx CMPD 147 00100 1 0011
5xx,8xx ICR 148 00100 1 0100
5xx,8xx DER 149 00100 1 0101
5xx,8xx COUNTA 150 00100 1 0110
5xx,8xx COUNTB 151 00100 1 0111
5xx,8xx CMPE 152 00100 1 1000
5xx,8xx CMPF 153 00100 1 1001
5xx,8xx CMPG 154 00100 1 1010
5xx,8xx CMPH 155 00100 1 1011
5xx,8xx LCTRL1 156 00100 1 1100
5xx,8xx LCTRL2 157 00100 1 1101
5xx,8xx ICTRL 158 00100 1 1110
5xx,8xx BAR 159 00100 1 1111
PowerPC ASR 280 01000 1 1000
PowerPC EAR 282 01000 1 1010
Any SPRN in the range 0x000-0x1FF (0-511) that is not Defined (see Table B-1)
and is not Preserved (see Table B-2) is Reserved.
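The classification rule can be expressed as a short lookup. In this Python sketch the preserved numbers are taken from the table above, while the set of defined SPRNs must be supplied by the caller because the entries of Table B-1 are not reproduced here. The function names are hypothetical.

```python
PRESERVED = {
    18: "DSISR", 19: "DAR", 25: "SDR1", 80: "EIE", 81: "EID", 82: "NRE",
    144: "CMPA", 145: "CMPB", 146: "CMPC", 147: "CMPD", 148: "ICR",
    149: "DER", 150: "COUNTA", 151: "COUNTB", 152: "CMPE", 153: "CMPF",
    154: "CMPG", 155: "CMPH", 156: "LCTRL1", 157: "LCTRL2", 158: "ICTRL",
    159: "BAR", 280: "ASR", 282: "EAR",
}

def classify_sprn(sprn, defined):
    """Classify an SPRN in 0x000-0x1FF as defined, preserved, or reserved.

    `defined` is a caller-supplied set of Defined SPRNs (from Table B-1).
    """
    if not 0 <= sprn <= 0x1FF:
        raise ValueError("rule covers only SPRNs 0x000-0x1FF")
    if sprn in defined:
        return "defined"
    if sprn in PRESERVED:
        return "preserved"
    return "reserved"

def sprn_binary(sprn):
    """Format a 10-bit SPRN the way the Binary column is laid out."""
    bits = format(sprn, "010b")
    return f"{bits[:5]} {bits[5]} {bits[6:]}"
```

The formatter reproduces the Binary column of the preserved table, for example 18 as 00000 1 0010.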
C.1 Synchronization
This section gives examples of how the Storage Synchronization instructions can
be used to emulate various synchronization primitives and to provide more com-
plex forms of synchronization.
These examples have a common form. After possible initialization, there is a ‘con-
ditional sequence’ that begins with a Load And Reserve instruction, which may be
followed by memory accesses and/or computation that include neither a Load
And Reserve nor a Store Conditional, and ends with a Store Conditional instruction
with the same target address as the initial Load And Reserve. In most of the exam-
ples, failure of the Store Conditional causes a branch back to the Load And
Reserve for a repeated attempt. On the assumption that contention is low, the
conditional branch in the examples is optimized for the case in which the Store
Conditional succeeds, by setting the branch-prediction bit appropriately. These
examples focus on techniques for the correct modification of shared storage loca-
tions: see Note 4 in Section C.1.4, “Notes”, on page 386 for a discussion of how the
retry strategy can affect performance.
The Load And Reserve and Store Conditional instructions depend on the coher-
ence mechanism of the system. Stores to a given location are coherent if they are
serialized in some order, and no processor is able to observe a subset of those
stores as occurring in a conflicting order. See Section 6.1.6.1, “Storage Access
Ordering”, on page 114, for additional details.
Each load operation, whether ordinary or Load And Reserve, returns a value that
has a well-defined source. The source can be the Store or Store Conditional
instruction that wrote the value, an operation by some other mechanism that
accesses storage (e.g., an I/O device), or the initial state of storage.
The examples deal with words: they can be used for doublewords by changing all
lwarx instructions to ldarxe, all stwcx. instructions to stdcxe., all stw instruc-
tions to std, and all cmp[i] instructions with L=0 to cmp[i] with L=1. lwarx-
stwcx. pairs can also be substituted with lwarxe-stwcxe. pairs.
Programming Note
Because the Storage Synchronization instructions have implementation dependencies
(e.g., the granularity at which reservations are managed), they must be used with care.
The operating system should provide system library programs that use these instruc-
tions to implement the high-level synchronization functions (Test and Set, Compare and
Swap, etc.) needed by application programs. Application programs should use these
library programs, rather than use the Storage Synchronization instructions directly.
The sequences used to emulate the various primitives consist primarily of a loop
using lwarx and stwcx.. No additional synchronization is necessary, because the
stwcx. will fail, setting the EQ bit to 0, if the word loaded by lwarx has changed
before the stwcx. is executed: see Section 6.1.6.2, “Atomic Update Primitives”, on
page 117 for more detail.
The ‘Fetch and No-op’ primitive atomically loads the current value in a word in
storage.
Note:
1. The stwcx., if it succeeds, stores to the target location the same value that
was loaded by the preceding lwarx. While the store is redundant with respect
to the value in the location, its success ensures that the value loaded by the
lwarx was the current value, i.e., that the source of the value loaded by the
lwarx was the last store to the location that preceded the stwcx. in the
coherence order for the location.
The ‘Fetch and Store’ primitive atomically loads and replaces a word in storage.
In this example it is assumed that the address of the word to be loaded and
replaced is in GPR(3), the new value is in GPR(4), and the old value is
returned in GPR(5).
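The retry structure of such a conditional sequence can be modeled in Python. This is a sketch under stated assumptions, not the instruction sequence itself: the Word class stands in for one word of storage with a single reservation, and the `interference` argument is a hypothetical test hook that simulates stores by another processor between the load and the conditional store.

```python
class Word:
    """One word of storage plus a reservation flag (toy model)."""

    def __init__(self, value):
        self.value = value
        self.reserved = False

    def lwarx(self):
        self.reserved = True
        return self.value

    def store(self, value):
        """Ordinary store by another processor: clears the reservation."""
        self.value = value
        self.reserved = False

    def stwcx(self, value):
        ok, self.reserved = self.reserved, False
        if ok:
            self.value = value
        return ok

def fetch_and_store(word, new_value, interference=()):
    pending = list(interference)
    while True:
        old = word.lwarx()                 # load and reserve
        if pending:
            word.store(pending.pop(0))     # simulated competing store
        if word.stwcx(new_value):          # store only if still reserved
            return old                     # otherwise retry from the load
```

When a competing store intervenes, the first attempt fails and the value returned is the one written by the competitor, which is the value actually replaced.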
The ‘Fetch and AND’ primitive atomically ANDs a value into a word in storage.
Note:
This version of the ‘Test and Set’ primitive atomically loads a word from storage,
sets the word in storage to a nonzero value if the value loaded is zero, and sets the
EQ bit of CR Field 0 to indicate whether the value loaded is zero.
The ‘Compare and Swap’ primitive atomically compares a value in a register with
a word in storage. If they are equal, it stores the value from a second register into
the word in storage; if they are unequal, it loads the word from storage into the
first register. In either case it sets the EQ bit of CR Field 0 to indicate the result
of the comparison.
Notes:
1. The semantics given for ‘Compare and Swap’ above are based on those of the
IBM System/370 Compare and Swap instruction. Other architectures may
define a Compare and Swap instruction differently.
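The semantics described for ‘Compare and Swap’ can be sketched with a toy Python model. The Word class and the return convention (a pair modeling the EQ bit and the resulting contents of the first register) are assumptions of this sketch.

```python
class Word:
    """One word of storage plus a reservation flag (toy model)."""

    def __init__(self, value):
        self.value = value
        self.reserved = False

    def lwarx(self):
        self.reserved = True
        return self.value

    def stwcx(self, value):
        ok, self.reserved = self.reserved, False
        if ok:
            self.value = value
        return ok

def compare_and_swap(word, expected, new_value):
    """Return (eq, first_register) after the primitive completes."""
    while True:
        current = word.lwarx()
        if current != expected:
            return False, current      # unequal: first register gets the word
        if word.stwcx(new_value):
            return True, expected      # equal and store performed
        # store failed because the reservation was lost: retry
```

Note that on the unequal path the model returns without executing a Store Conditional, so the reservation remains held.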
Because the shared resource must not be accessed until the lock has been set,
the ‘lock’ procedure contains an isync instruction after the bc that checks for the
success of test_and_set. The isync instruction delays all subsequent instructions
until all preceding instructions have completed.
The ‘unlock’ procedure stores a 0 to the lock location. Most applications that use
locking require, for correctness, that if the access to the shared resource includes
stores, the program must execute a msync instruction before releasing the lock.
The msync instruction ensures that the program's modifications will be per-
formed with respect to other processors before the store that releases the lock is
performed with respect to those processors. In this example, the ‘unlock’ proce-
dure begins with a msync for this purpose.
The ‘next element pointer’ from the list element after which the new element is to
be inserted, here called the ‘parent element’, is stored into the new element, so
that the new element points to the next element in the list: this store is performed
unconditionally. Then the address of the new element is conditionally stored into
the parent element, thereby adding the new element to the list.
In this example it is assumed that the address of the parent element is in GPR(3),
the address of the new element is in GPR(4), and the next element pointer is at off-
set 0 from the start of the element. It is also assumed that the next element
pointer of each list element is in a ‘reservation granule’ separate from that of the
next element pointer of all other list elements: see Section 6.1.6.2, “Atomic Update
Primitives”, on page 117.
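The insertion described above can be modeled in Python. This is an illustrative sketch: each Element carries its own reservation flag, which corresponds to the stated assumption that every next element pointer lies in a separate reservation granule, and the `rivals` argument is a hypothetical test hook simulating another processor that inserts first.

```python
class Element:
    def __init__(self, name, next=None):
        self.name = name
        self.next = next
        self.reserved = False      # reservation on this element's pointer

def lwarx_next(elem):
    elem.reserved = True
    return elem.next

def stwcx_next(elem, value):
    ok, elem.reserved = elem.reserved, False
    if ok:
        elem.next = value
    return ok

def plain_store_next(elem, value):
    """Store by another processor: clears any reservation on the pointer."""
    elem.next = value
    elem.reserved = False

def insert_after(parent, new, rivals=()):
    rivals = list(rivals)
    while True:
        nxt = lwarx_next(parent)   # load and reserve the parent's pointer
        new.next = nxt             # unconditional store into the new element
        if rivals:                 # simulated competing insertion
            rival = rivals.pop(0)
            rival.next = parent.next
            plain_store_next(parent, rival)
        if stwcx_next(parent, new):   # conditional store into the parent
            return
```

If a rival insertion intervenes, the conditional store fails, the loop reloads the parent's pointer, and the new element ends up linked in front of the rival.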
In the preceding example, if two list elements have next element pointers in the
same reservation granule then, in a multiprocessor, ‘livelock’ can occur. (Livelock
is a state in which processors interact in a way such that no processor makes
progress.)
If it is not possible to allocate list elements such that each element's next element
pointer is in a different reservation granule, then livelock can be avoided by using
the following, more complicated, sequence.
4. The manner in which lwarx and stwcx. are communicated to other proces-
sors and mechanisms, and between levels of the storage subsystem within a
given processor (see Section 6.1.6.2, “Atomic Update Primitives”, on
page 117), is implementation-dependent. In some implementations perfor-
mance may be improved by minimizing looping on a lwarx instruction that
fails to return a desired value. For example, in the ‘Test and Set’ example
shown above, if the programmer wishes to stay in the loop until the word
loaded is zero, he could change the ‘bne- $+12’ to ‘bne- loop’. However, in some
implementations better performance may be obtained by using an ordinary
Load instruction to do the initial checking of the value, as follows.
The examples shown below distinguish between the cases N=2 and N>2. If N=2,
the shift amount may be in the range 0 through 127 (64-bit implementations) or 0
through 63 (32-bit implementations), which are the maximum ranges supported
by the Shift instructions used. However if N>2, the shift amount must be in the
range 0 through 63 (64-bit implementations) or 0 through 31 (32-bit implementa-
tions), in order for the examples to yield the desired result. The specific instance
shown for N>2 is N=3: extending those code sequences to larger N is straightfor-
ward, as is reducing them to the case N=2 when the more stringent restriction on
shift amount is met. For shifts with immediate shift amounts only the case N=3 is
shown, because the more stringent restriction on shift amount is always met.
In the examples it is assumed that GPRs 2 and 3 (and 4) contain the quantity to
be shifted, and that the result is to be placed into the same registers, except for
the immediate left shifts in 64-bit implementations, for which the result is placed
into GPRs 3, 4, and 5. In all cases, for both input and result, the lowest-numbered
register contains the highest-order part of the data and highest-numbered register
contains the lowest-order part. For non-immediate shifts, the shift amount is
assumed to be in GPR(6). For immediate shifts, the shift amount is assumed to be
greater than 0. GPRs 0 and 31 are used as scratch registers.
For N>2, the number of instructions required is 2N–1 (immediate shifts) or 3N–1
(non-immediate shifts).
Shift Left Immediate, N=3 (shift amount < 64) Shift Left Immediate, N=3 (shift amount < 32)
rldicr r5,r4,sh,63-sh rlwinm r2,r2,sh,0,31-sh
rldimi r4,r3,0,sh rlwimi r2,r3,sh,32-sh,31
rldicl r4,r4,sh,0 rlwinm r3,r3,sh,0,31-sh
rldimi r3,r2,0,sh rlwimi r3,r4,sh,32-sh,31
rldicl r3,r3,sh,0 rlwinm r4,r4,sh,0,31-sh
Shift Left, N=2 (shift amount < 128) Shift Left, N=2 (shift amount < 64)
subfic r31,r6,64 subfic r31,r6,32
sld r2,r2,r6 slw r2,r2,r6
srd r0,r3,r31 srw r0,r3,r31
or r2,r2,r0 or r2,r2,r0
addi r31,r6,-64 addi r31,r6,-32
sld r0,r3,r31 slw r0,r3,r31
or r2,r2,r0 or r2,r2,r0
sld r3,r3,r6 slw r3,r3,r6
Shift Left, N=3 (shift amount < 64) Shift Left, N=3 (shift amount < 32)
subfic r31,r6,64 subfic r31,r6,32
sld r2,r2,r6 slw r2,r2,r6
srd r0,r3,r31 srw r0,r3,r31
or r2,r2,r0 or r2,r2,r0
sld r3,r3,r6 slw r3,r3,r6
srd r0,r4,r31 srw r0,r4,r31
or r3,r3,r0 or r3,r3,r0
sld r4,r4,r6 slw r4,r4,r6
Shift Right Immediate, N=3 (shift amount < 64) Shift Right Immediate, N=3 (shift amount < 32)
rldimi r4,r3,0,64-sh rlwinm r4,r4,32-sh,sh,31
rldicl r4,r4,64-sh,0 rlwimi r4,r3,32-sh,0,sh-1
rldimi r3,r2,0,64-sh rlwinm r3,r3,32-sh,sh,31
rldicl r3,r3,64-sh,0 rlwimi r3,r2,32-sh,0,sh-1
rldicl r2,r2,64-sh,sh rlwinm r2,r2,32-sh,sh,31
Shift Right, N=2 (shift amount < 128) Shift Right, N=2 (shift amount < 64)
subfic r31,r6,64 subfic r31,r6,32
srd r3,r3,r6 srw r3,r3,r6
sld r0,r2,r31 slw r0,r2,r31
or r3,r3,r0 or r3,r3,r0
addi r31,r6,-64 addi r31,r6,-32
srd r0,r2,r31 srw r0,r2,r31
or r3,r3,r0 or r3,r3,r0
srd r2,r2,r6 srw r2,r2,r6
Shift Right, N=3 (shift amount < 64) Shift Right, N=3 (shift amount < 32)
subfic r31,r6,64 subfic r31,r6,32
srd r4,r4,r6 srw r4,r4,r6
sld r0,r3,r31 slw r0,r3,r31
or r4,r4,r0 or r4,r4,r0
srd r3,r3,r6 srw r3,r3,r6
sld r0,r2,r31 slw r0,r2,r31
or r3,r3,r0 or r3,r3,r0
srd r2,r2,r6 srw r2,r2,r6
Shift Right Algebraic Immediate, N=3 (shift amount < 64) Shift Right Algebraic Immediate, N=3 (shift amount < 32)
rldimi r4,r3,0,64-sh rlwinm r4,r4,32-sh,sh,31
rldicl r4,r4,64-sh,0 rlwimi r4,r3,32-sh,0,sh-1
rldimi r3,r2,0,64-sh rlwinm r3,r3,32-sh,sh,31
rldicl r3,r3,64-sh,0 rlwimi r3,r2,32-sh,0,sh-1
sradi r2,r2,sh srawi r2,r2,sh
Shift Right Algebraic, N=2 (shift amount < 128) Shift Right Algebraic, N=2 (shift amount < 64)
subfic r31,r6,64 subfic r31,r6,32
srd r3,r3,r6 srw r3,r3,r6
sld r0,r2,r31 slw r0,r2,r31
or r3,r3,r0 or r3,r3,r0
addic. r31,r6,-64 addic. r31,r6,-32
srad r0,r2,r31 sraw r0,r2,r31
bc 4,1,$+8 bc 4,1,$+8
ori r3,r0,0 ori r3,r0,0
srad r2,r2,r6 sraw r2,r2,r6
Shift Right Algebraic, N=3 (shift amount < 64) Shift Right Algebraic, N=3 (shift amount < 32)
subfic r31,r6,64 subfic r31,r6,32
srd r4,r4,r6 srw r4,r4,r6
sld r0,r3,r31 slw r0,r3,r31
or r4,r4,r0 or r4,r4,r0
srd r3,r3,r6 srw r3,r3,r6
sld r0,r2,r31 slw r0,r2,r31
or r3,r3,r0 or r3,r3,r0
srad r2,r2,r6 sraw r2,r2,r6
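The 64-bit ‘Shift Left, N=2’ sequence listed earlier relies on sld and srd returning zero for shift amounts of 64 through 127. The following Python sketch transcribes that sequence one instruction per line so it can be checked against an ordinary 128-bit shift. The function names are hypothetical; the shift semantics assumed are that only the low-order 7 bits of the shift amount register are used.

```python
MASK64 = (1 << 64) - 1

def sld(value, amount):
    """Shift left doubleword: low 7 bits of amount; 64..127 yield 0."""
    n = amount & 0x7F
    return 0 if n >= 64 else (value << n) & MASK64

def srd(value, amount):
    """Shift right doubleword: low 7 bits of amount; 64..127 yield 0."""
    n = amount & 0x7F
    return 0 if n >= 64 else (value & MASK64) >> n

def shift_left_n2(r2, r3, r6):
    """r2 holds the high-order doubleword, r3 the low-order one."""
    r31 = 64 - r6            # subfic r31,r6,64
    r2 = sld(r2, r6)         # sld    r2,r2,r6
    r0 = srd(r3, r31)        # srd    r0,r3,r31
    r2 |= r0                 # or     r2,r2,r0
    r31 = r6 - 64            # addi   r31,r6,-64
    r0 = sld(r3, r31)        # sld    r0,r3,r31
    r2 |= r0                 # or     r2,r2,r0
    r3 = sld(r3, r6)         # sld    r3,r3,r6
    return r2, r3
```

For shift amounts below 64 the second sld contributes zero, and for amounts of 64 or more the srd contributes zero, so exactly one of the two terms carries bits from r3 into r2.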
This section gives examples of how the Floating-Point Conversion instructions can
be used to perform various conversions.
Warning: Some of the examples use the optional fsel instruction. Care must be
taken in using fsel if IEEE compatibility is required, or if the values being tested
can be NaNs or infinities: see Section C.4.4, “Notes”, on page 396.
The full convert to floating-point integer function can be implemented with the
sequence shown below, assuming the floating-point value to be converted is in
FPR(1) and the result is returned in FPR(3).
In a 64-bit implementation
The full convert to signed integer doubleword function can be implemented with
the sequence shown below, assuming the floating-point value to be converted is in
FPR(1), the result is returned in GPR(3), and a doubleword at displacement ‘disp’
from the address in GPR(1) can be used as scratch space.
In a 32-bit implementation
The full convert to signed integer doubleword function can be implemented with
the sequence shown below, assuming the floating-point value to be converted is in
FPR(1), bits 0:31 of the result are returned in GPR(3), bits 32:63 of the result are
returned in GPR(4), and a doubleword at displacement ‘disp’ from the address in
GPR(1) can be used as scratch space.
In a 64-bit implementation
The full convert to unsigned integer doubleword function can be implemented with
the sequence shown below, assuming the floating-point value to be converted is in
FPR(1), the value 0 is in FPR(0), the value 2^64 - 2048 is in FPR(3), the value 2^63 is in
FPR(4) and GPR(4), the result is returned in GPR(3), and a doubleword at
displacement ‘disp’ from the address in GPR(1) can be used as scratch space.
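The constants in this description follow from the double format: values in the range 2^63 up to 2^64 are spaced 2^11 = 2048 apart, so 2^64 - 2048 is the largest double below 2^64. The Python sketch below models the clamp, bias, convert, and unbias approach that these constants suggest. It is not the instruction sequence: the function name is hypothetical, Python's round() (round to nearest, ties to even) stands in for the conversion instruction, and NaN inputs are rejected rather than modeled.

```python
TWO63 = 2.0 ** 63
MAX_U64_DOUBLE = 2.0 ** 64 - 2048.0   # largest double below 2**64

def to_uint64(x):
    """Saturating double -> unsigned 64-bit integer conversion (sketch)."""
    if x != x:
        raise ValueError("NaN is outside the scope of this sketch")
    v = min(max(x, 0.0), MAX_U64_DOUBLE)   # clamp to [0, 2**64 - 2048]
    high = v >= TWO63
    if high:
        v -= TWO63                         # bring into the signed range
    result = round(v)                      # signed conversion stand-in
    if high:
        result += 2 ** 63                  # undo the bias in integer form
    return result
```

The subtraction of 2^63 is exact for every double in the upper half of the range, so the bias does not introduce a rounding error of its own.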
In a 32-bit implementation
Editors' Note
To be supplied.
In a 64-bit implementation
The full convert to unsigned integer word function can be implemented with the
sequence shown below, assuming the floating-point value to be converted is in
FPR(1), the value 0 is in FPR(0), the value 2^32 - 1 is in FPR(3), the result is returned
in GPR(3), and a doubleword at displacement ‘disp’ from the address in GPR(1)
can be used as scratch space.
In a 32-bit implementation
The full convert to unsigned integer word function can be implemented with the
sequence shown below, assuming the floating-point value to be converted is in
FPR(1), the value 0 is in FPR(0), the value 2^32 - 1 is in FPR(3), the value 2^31 is in
FPR(4), the result is returned in GPR(3), and a doubleword at displacement ‘disp’
from the address in GPR(1) can be used as scratch space.
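The extra constant 2^31 in the 32-bit case suggests that the value is biased into signed-word range before conversion. A minimal C sketch of that idea follows; the name fp_to_u32 is hypothetical, the conversion truncates, and NaN is mapped to 0 as an assumption of the sketch rather than a statement about the Book E sequence.

```c
#include <stdint.h>

/* Hypothetical helper: clamped double -> unsigned 32-bit conversion
 * using only a signed 32-bit conversion step. */
uint32_t fp_to_u32(double x)
{
    const double max_val = 4294967295.0;  /* 2^32 - 1 */
    const double two31   = 2147483648.0;  /* 2^31 */

    double v = (x >= 0.0) ? x : 0.0;      /* negative or NaN -> 0 (assumption) */
    if (v > max_val)
        v = max_val;

    /* Values at or above 2^31 do not fit a signed word: bias, convert, unbias. */
    if (v >= two31)
        return (uint32_t)(int32_t)(v - two31) + UINT32_C(0x80000000);

    return (uint32_t)(int32_t)v;
}
```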
The full convert from signed integer doubleword function, using the rounding
mode specified by FPSCR[RN], can be implemented with the sequence shown below,
assuming the integer value to be converted is in GPR(3), the result is returned in
FPR(1), and a doubleword at displacement ‘disp’ from the address in GPR(1) can
be used as scratch space.
In a 64-bit implementation
The full convert from unsigned integer doubleword function, using the rounding
mode specified by FPSCR[RN], can be implemented with the sequence shown below,
assuming the integer value to be converted is in GPR(3), the value 2^32 is in FPR(4),
the result is returned in FPR(1), and two doublewords at displacement ‘disp’ from
the address in GPR(1) can be used as scratch space.
In a 32-bit implementation
The full convert from unsigned integer doubleword function, using the rounding
mode specified by FPSCR[RN], can be implemented with the sequence shown below,
assuming bits 0:31 of the doubleword integer value to be converted are in GPR(2),
bits 32:63 of the doubleword integer value to be converted are in GPR(3), the value
0 is in GPR(0), the value 2^32 is in FPR(4), the result is returned in FPR(1), and two
doublewords at displacement ‘disp’ from the address in GPR(1) can be used as
scratch space.
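The presence of the constant 2^32 and of two scratch doublewords suggests converting the two 32-bit halves separately and recombining them. The following C sketch shows that arithmetic under this assumption; the name u64_to_fp is hypothetical. Scaling the high half by 2^32 is exact, so the only rounding happens in the final addition, which is performed in the current rounding mode.

```c
#include <stdint.h>

/* Hypothetical helper: unsigned 64-bit -> double via two 32-bit halves. */
double u64_to_fp(uint64_t x)
{
    const double two32 = 4294967296.0;   /* 2^32 */
    uint32_t hi = (uint32_t)(x >> 32);   /* bits 0:31 in Book E numbering */
    uint32_t lo = (uint32_t)x;           /* bits 32:63 in Book E numbering */

    /* Each half converts exactly; hi * 2^32 is exact; the add rounds once. */
    return (double)hi * two32 + (double)lo;
}
```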
In a 64-bit implementation
The full convert from signed integer word function can be implemented with the
sequence shown below, assuming the integer value to be converted is in GPR(3),
the result is returned in FPR(1), and a doubleword at displacement ‘disp’ from the
address in GPR(1) can be used as scratch space. (The result is exact.)
In a 32-bit implementation
The full convert from signed integer word function can be implemented with the
sequence shown below, assuming the integer value to be converted is in GPR(3),
the result is returned in FPR(1), and a doubleword at displacement ‘disp’ from the
address in GPR(1) can be used as scratch space. (The result is exact.)
In a 64-bit implementation
The full convert from unsigned integer word function can be implemented with the
sequence shown below, assuming the integer value to be converted is in GPR(3),
the result is returned in FPR(1), and a doubleword at displacement ‘disp’ from the
address in GPR(1) can be used as scratch space. (The result is exact.)
In a 32-bit implementation
The full convert from unsigned integer word function can be implemented with the
sequence shown below, assuming the integer value to be converted is in GPR(3), a
value of 0 is in GPR(0), the result is returned in FPR(1), and a doubleword at
displacement ‘disp’ from the address in GPR(1) can be used as scratch space. (The
result is exact.)
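The zero held in GPR(0) suggests that the unsigned word is zero-extended to a doubleword in the scratch area before a signed doubleword conversion is applied. A minimal C sketch of that reading follows; the name u32_to_fp and the scratch variable names are hypothetical. The result is exact because every 32-bit integer fits in the 53-bit significand of a double.

```c
#include <stdint.h>

/* Hypothetical helper: unsigned 32-bit -> double by zero extension. */
double u32_to_fp(uint32_t x)
{
    uint32_t scratch_hi = 0;   /* models the word written from the zero register */
    uint32_t scratch_lo = x;   /* models the word written from the source register */

    /* Reassemble the scratch doubleword and convert it as a signed integer. */
    int64_t dword = (int64_t)(((uint64_t)scratch_hi << 32) | scratch_lo);
    return (double)dword;
}
```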
This section gives examples of how the optional Floating Select instruction can be
used to implement floating-point minimum and maximum functions, and certain
simple forms of if-then-else constructions, without branching.
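As an illustration of the idea, the following C sketch models Floating Select as "choose the first value if the test operand is greater than or equal to zero, otherwise the second" and builds maximum and minimum from it. The names fsel_model, fp_max and fp_min are hypothetical, and the sketch does not model FPSCR side effects. A NaN test operand compares false and therefore selects the second value.

```c
/* Hypothetical model of Floating Select: NaN and negative tests pick 'otherwise'. */
static double fsel_model(double test, double if_ge_zero, double otherwise)
{
    return (test >= 0.0) ? if_ge_zero : otherwise;
}

/* Branch-free maximum: a - b >= 0 means a is at least as large as b. */
double fp_max(double a, double b)
{
    return fsel_model(a - b, a, b);
}

/* Branch-free minimum: same test, operands swapped. */
double fp_min(double a, double b)
{
    return fsel_model(a - b, b, a);
}
```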
In these Notes, the ‘optimized program’ is the Book E program shown, and the
‘unoptimized program’ (not shown) is the corresponding Book E program that
uses fcmpu and Branch Conditional instructions instead of fsel.
1. The unoptimized program affects the VXSNAN bit of the FPSCR, and therefore
may cause the system error handler to be invoked if the corresponding
exception is enabled, while the optimized program does not affect this bit.
This property of the optimized program is incompatible with the IEEE
standard.
4. The optimized program gives the incorrect result if a and b are infinities of the
same sign. (Here it is assumed that Invalid Operation Exceptions are
disabled, in which case the result of the subtraction is a NaN. The analysis is
more complicated if Invalid Operation Exceptions are enabled, because in that
case the target register of the subtraction is unchanged.)
5. The optimized program affects the OX, UX, XX, and VXISI bits of the FPSCR,
and therefore may cause the system error handler to be invoked if the
corresponding exceptions are enabled, while the unoptimized program does not
affect these bits. This property of the optimized program is incompatible with
the IEEE standard.
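The behavior described in Note 4 can be reproduced with a small C model. The sketch below is one branch-free way to express "if a = b then y else z" using a subtraction and two selects; the names fsel_model and select_if_equal are hypothetical, and FPSCR effects and enabled exceptions are not modeled. When a and b are infinities of the same sign, the difference is a NaN, both selects fall through, and z is returned even though a equals b.

```c
#include <math.h>

/* Hypothetical model of Floating Select: NaN and negative tests pick 'otherwise'. */
static double fsel_model(double test, double if_ge_zero, double otherwise)
{
    return (test >= 0.0) ? if_ge_zero : otherwise;
}

/* Returns y when a == b, otherwise z, without comparing or branching on a and b. */
double select_if_equal(double a, double b, double y, double z)
{
    double d = a - b;                 /* NaN if a and b are same-sign infinities */
    double x = fsel_model(d, y, z);   /* y if a >= b */
    return fsel_model(-d, x, z);      /* keep x only if b >= a as well */
}
```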
This appendix gives examples of how dependencies and the msync and mbar
instructions can be used to control storage access ordering when storage is
shared between programs.
In this example it is assumed that the address of the lock is in GPR 3, the value
indicating that the lock is free is in GPR 4, the value to which the lock should be
set is in GPR 5, the old value of the lock is returned in GPR 6, and the address of
the shared data structure is in GPR 9.
The second bc does not complete until CR0 has been set by the stwcx[e]. or
stdcxe. instruction. The stwcx[e]. or stdcxe. does not set CR0 until it has completed
(successfully or unsuccessfully). The lock is acquired when the stwcx[e]. or stdcxe.
completes successfully. Together, the second bc and the subsequent isync create
an import barrier that prevents the load from ‘data1’ from being performed until
the branch has been resolved not to be taken.
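For readers more familiar with C11 atomics, the same acquire pattern can be sketched as follows. This is an analogy, not a translation of the Book E sequence: the function name lock_try_acquire is hypothetical, the compare-exchange stands in for the load-and-reserve and store-conditional pair, and acquire ordering on success stands in for the import barrier formed by the branch and isync.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: returns the lock value that was observed.
 * The lock was taken if and only if the return value equals free_val. */
uint32_t lock_try_acquire(_Atomic uint32_t *lock, uint32_t free_val,
                          uint32_t held_val)
{
    uint32_t old = free_val;

    while (!atomic_compare_exchange_weak_explicit(
               lock, &old, held_val,
               memory_order_acquire,    /* later loads stay after a successful take */
               memory_order_relaxed)) {
        if (old != free_val)
            return old;                 /* lock is held: let the caller wait */
        /* Spurious failure (akin to a lost reservation): retry. */
    }
    return old;                         /* equals free_val: lock acquired */
}
```

A caller would read the shared data structure only after this function returns free_val.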
In this example it is assumed that the address of the pointer is in GPR 3, the
value to be added to the pointer is in GPR 4, and the old value of the pointer is
returned in GPR 5.
The load from ‘data1’ cannot be performed until the pointer value has been loaded
into GPR 5 by the lwarx[e] or ldarxe. The load from ‘data1’ may be performed
out-of-order before the stwcx[e]. or stdcxe. instruction. But if the stwcx[e]. or
stdcxe. fails, the branch is taken and the value returned by the load from ‘data1’
is discarded. If the stwcx[e]. or stdcxe. succeeds, the value returned by the load
from ‘data1’ is valid even if the load is performed out-of-order, because the load
uses the pointer value returned by the instance of the lwarx[e] or ldarxe that
created the reservation used by the successful stwcx[e]. or stdcxe. instruction.
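The control flow of this example can be sketched with C11 atomics. The sketch is only an analogy: the name take_entry is hypothetical, the pointer is modeled as an index into a table, and the C memory model does not promise the ordering that the Book E address dependency provides, so relaxed ordering here illustrates the retry logic rather than the ordering guarantee.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: advance a shared cursor by 'step' and read the
 * table entry addressed by the cursor value that was replaced. */
uint32_t take_entry(_Atomic uint32_t *cursor, uint32_t step,
                    const uint32_t *table)
{
    uint32_t old = atomic_load_explicit(cursor, memory_order_relaxed);

    /* A failed exchange refreshes 'old', so any value read with a stale
     * 'old' is never used, matching the discarded load in the text. */
    while (!atomic_compare_exchange_weak_explicit(
               cursor, &old, old + step,
               memory_order_relaxed, memory_order_relaxed))
        ;

    return table[old];   /* address depends on the value the exchange observed */
}
```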
In this example it is assumed that the lock is in storage that is Caching Inhibited,
the shared data structure is in storage that is not Caching Inhibited, the address
of the lock is in GPR 3, the value indicating that the lock is free is in GPR 4, and
the address of the shared data structure is in GPR 9.
The msync ensures that the store that releases the lock will not be performed
with respect to any other processor until all stores caused by instructions
preceding the msync have been performed with respect to that processor.
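In C11 terms, a comparable release can be sketched with a full fence followed by the store that frees the lock. The name lock_release is hypothetical, and the sequentially consistent fence is used here only as the closest portable analogue of msync; it is not claimed to be an exact equivalent.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: publish all earlier stores, then free the lock. */
void lock_release(_Atomic uint32_t *lock, uint32_t free_val)
{
    /* Full barrier: earlier stores become visible before the releasing store. */
    atomic_thread_fence(memory_order_seq_cst);

    /* The store that releases the lock. */
    atomic_store_explicit(lock, free_val, memory_order_relaxed);
}
```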
In this example it is assumed that both the lock and the shared data structure are
in storage that is neither Caching Inhibited nor Write Through Required, the
address of the lock is in GPR 3, the value indicating that the lock is free is in GPR
4, and the address of the shared data structure is in GPR 9.
Recall that, for storage that is neither Caching Inhibited nor Write Through
Required, mbar orders only stores and has no effect on loads. If the portion of the
program preceding the mbar contains loads from the shared data structure and
the stores to the shared data structure do not depend on the values returned by
those loads, the store that releases the lock could be performed before those
loads. If it is necessary to ensure that those loads are performed before the store
that releases the lock, the programmer can either use the msync instruction as in
Section D.2.1 or use the technique described in Section D.3.
If a load must be performed before a subsequent store (e.g., the store that releases
a lock protecting a shared data structure), a technique similar to the following can
be used.
In this example it is assumed that the address of the storage operand to be loaded
is in GPR 3, the contents of the storage operand are returned in GPR 4, and the
address of the storage operand to be stored is in GPR 5.
The following list identifies the areas in which these optimizations can be made.
1. Receipt of TLB entry invalidate requests from other processors. Since the
design will not be used in SMP systems, this function is not required.
c) Does the design of any I/O subsystem require notification that an msync
is being executed?
Architecture Note
There is a pending proposal for these functions, so this requirement is dependent on
the resolution of that proposal.
Primary Opcode   Extended Opcodes
0                No preserved extended opcodes
4                No preserved extended opcodes
19               No preserved extended opcodes
31               Extended opcodes (bits 21:30)
                 210  0b00110_10010 (mtsr)
                 242  0b00111_10010 (mtsrin)
                 370  0b01011_10010 (tlbia)
                 306  0b01001_10010 (tlbie)
Primary Opcode   Extended Opcodes
0                All instruction encodings (bits 6:31) except 0x0000_0000 [a]
4                All instruction encodings (bits 6:31)
19               Extended opcodes (bits 21:30)
                 ---  0buuuuu_0u11u
31               Extended opcodes (bits 21:30)
                 ---  0buuuuu_0u11u
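The pattern 0buuuuu_0u11u fixes three of the ten extended-opcode bits and leaves the bits marked u unconstrained. A minimal C sketch of a matcher follows; the function names are hypothetical, and the field extraction assumes the instruction is held in a 32-bit word with instruction bit 0 as the most significant bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Primary opcode: instruction bits 0:5. */
static uint32_t primary_opcode(uint32_t insn)
{
    return insn >> 26;
}

/* Extended opcode: instruction bits 21:30. */
static uint32_t extended_opcode(uint32_t insn)
{
    return (insn >> 1) & 0x3FF;
}

/* True if the 10-bit extended opcode has the form uuuuu_0u11u:
 * the fixed positions are 0b10110 (mask) and must equal 0b00110. */
static bool matches_uuuuu_0u11u(uint32_t xo)
{
    return (xo & 0x016) == 0x006;
}
```

For example, extended opcode 6 (0b00000_00110) matches, while 854 (0b11010_10110) does not because the first bit of the second group is 1.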
With the exception of the instruction consisting entirely of binary 0s, the reserved
instructions are available for future extensions to Book E: that is, some future
version of Book E may define any of these instructions to perform new functions.
There are two forms of reserved instructions, reserved-nop and reserved-illegal
instructions.
This section contains tables showing the Defined primary and extended opcodes
in all members of the Book E family.
Opcode in Decimal      Opcode in Hexadecimal
Instruction Mnemonic
Applicable Machines    Instruction Format
‘Applicable Machines’ identifies the Book E family members that recognize the
opcode, encoded as follows:
E Book E only
P PowerPC Architecture ‘classic’ (retained in Book E) [1]
When instruction names and/or mnemonics differ among the family members,
Book E terminology is used.
1. PowerPC Architecture "classic" instructions not retained in Book E are considered preserved.
Decimal  Hex     Mnemonic    Instruction                                          Machines  Format
0-1      00-01   rldicl      Rotate Left Doubleword Immediate then Clear Left     P         MD
2-3      02-03   rldicr      Rotate Left Doubleword Immediate then Clear Right    P         MD
4-5      04-05   rldic       Rotate Left Doubleword Immediate then Clear          P
6-7      06-07   rldimi      Rotate Left Doubleword Immediate then Mask Insert    P
8        08      rldcl       Rotate Left Doubleword then Clear Left               P         MD
9        09      rldcr       Rotate Left Doubleword then Clear Right              P         MD
10       0A      <Reserved>
11       0B      <Reserved>
12       0C      <Reserved>
13       0D      <Reserved>
14       0E      <Reserved>
15       0F      <Reserved>
Table G-3. Extended opcodes for primary opcode 58 (instruction bits 28:31)
Table G-4. Extended opcodes for primary opcode 62 (instruction bits 28:31)
[Opcode map grids: only the row and column index values and the following entries survive: isync (P, XL); bcctre (E, XL); bcctr (P, XL); mbar (P, X).]
Format  Primary opcode (Inst0:5)  Extended opcode (Inst21:31)  Mnemonic  Instruction  Page
D 000010 ----- ----- - tdi Trap Doubleword Immediate 361
D 000011 ----- ----- - twi Trap Word Immediate 367
D 000111 ----- ----- - mulli Multiply Low Immediate 319
D 001000 ----- ----- - subfic Subtract From Immediate Carrying 358
B 001001 ----- ----0 0 bce Branch Conditional Extended 238
B 001001 ----- ----0 1 bcel Branch Conditional Extended & Link 238
B 001001 ----- ----1 0 bcea Branch Conditional Extended Absolute 238
B 001001 ----- ----1 1 bcela Branch Conditional Extended & Link Absolute 238
D 001010 ----- ----- - cmpli Compare Logical Immediate 242
D 001011 ----- ----- - cmpi Compare Immediate 241
D 001100 ----- ----- - addic Add Immediate Carrying 233
D 001101 ----- ----- - addic. Add Immediate Carrying & record CR 233
D 001110 ----- ----- - addi Add Immediate 232
D 001111 ----- ----- - addis Add Immediate Shifted 232
B 010000 ----- ----0 0 bc Branch Conditional 238
B 010000 ----- ----0 1 bcl Branch Conditional & Link 238
B 010000 ----- ----1 0 bca Branch Conditional Absolute 238
B 010000 ----- ----1 1 bcla Branch Conditional & Link Absolute 238
SC 010001 ///// ////1 / sc System Call 334
I 010010 ----- ----0 0 b Branch 237
I 010010 ----- ----0 1 bl Branch & Link 237
I 010010 ----- ----1 0 ba Branch Absolute 237
Legend:
- Don’t care, usually part of an operand field
/ Reserved bit, invalid instruction form if encoded as 1
? Allocated for implementation-dependent use. See the User’s Manual for the implementation
I 010010 ----- ----1 1 bla Branch & Link Absolute 237
XL 010011 00000 00000 / mcrf Move Condition Register Field 305
XL 010011 00000 10000 0 bclr Branch Conditional to Link Register 240
XL 010011 00000 10000 1 bclrl Branch Conditional to Link Register & Link 240
XL 010011 00000 10001 0 bclre Branch Conditional to Link Register Extended 240
XL 010011 00000 10001 1 bclrel Branch Conditional to Link Register Extended & Link 240
XL 010011 00001 00001 / crnor Condition Register NOR 245
XL 010011 00001 10010 / rfi Return From Interrupt 326
XL 010011 00001 10011 / rfci Return From Critical Interrupt 325
XL 010011 00100 00001 / crandc Condition Register AND with Complement 244
XL 010011 00100 10110 / isync Instruction Synchronize 288
XL 010011 00110 00001 / crxor Condition Register XOR 246
XL 010011 00111 00001 / crnand Condition Register NAND 245
XL 010011 01000 00001 / crand Condition Register AND 244
XL 010011 01001 00001 / creqv Condition Register Equivalent 244
XL 010011 01101 00001 / crorc Condition Register OR with Complement 246
XL 010011 01110 00001 / cror Condition Register OR 245
XL 010011 10000 10000 0 bcctr Branch Conditional to Count Register 239
XL 010011 10000 10000 1 bcctrl Branch Conditional to Count Register & Link 239
XL 010011 10000 10001 0 bcctre Branch Conditional to Count Register Extended 239
XL 010011 10000 10001 1 bcctrel Branch Conditional to Count Register Extended & Link 239
M 010100 ----- ----- 0 rlwimi Rotate Left Word Immed then Mask Insert 331
M 010100 ----- ----- 1 rlwimi. Rotate Left Word Immed then Mask Insert & record CR 331
M 010101 ----- ----- 0 rlwinm Rotate Left Word Immed then AND with Mask 332
M 010101 ----- ----- 1 rlwinm. Rotate Left Word Immed then AND with Mask & record CR 332
I 010110 ----- ----0 0 be Branch Extended 238
I 010110 ----- ----0 1 bel Branch Extended & Link 238
I 010110 ----- ----1 0 bea Branch Extended Absolute 238
I 010110 ----- ----1 1 bela Branch Extended & Link Absolute 238
M 010111 ----- ----- 0 rlwnm Rotate Left Word then AND with Mask 332
M 010111 ----- ----- 1 rlwnm. Rotate Left Word then AND with Mask & record CR 332
D 011000 ----- ----- - ori OR Immediate 324
D 011001 ----- ----- - oris OR Immediate Shifted 324
D 011010 ----- ----- - xori XOR Immediate 369
D 011011 ----- ----- - xoris XOR Immediate Shifted 369
D 011100 ----- ----- - andi. AND Immediate & record CR 236
D 011101 ----- ----- - andis. AND Immediate Shifted & record CR 236
MD 011110 ----- -000- / rldicl Rotate Left Doubleword Immediate then Clear Left 327
MD 011110 ----- -001- / rldicr Rotate Left Doubleword Immediate then Clear Right 328
MD 011110 ----- -010- / rldic Rotate Left Doubleword Immediate then Clear 329
MD 011110 ----- -011- / rldimi Rotate Left Doubleword Immediate then Mask Insert 330
MDS 011110 ----- -1000 / rldcl Rotate Left Doubleword then Clear Left 327
MDS 011110 ----- -1001 / rldcr Rotate Left Doubleword then Clear Right 328
X 011111 00000 00000 / cmp Compare 241
X 011111 00000 00100 / tw Trap Word 367
X 011111 00000 01000 0 subfc Subtract From Carrying 356
X 011111 00000 01000 1 subfc. Subtract From Carrying & record CR 356
X 011111 /0000 01001 / mulhdu Multiply High Doubleword Unsigned 317
X 011111 00000 01010 0 addc Add Carrying 230
X 011111 00000 01010 1 addc. Add Carrying & record CR 230
X 011111 /0000 01011 0 mulhwu Multiply High Word Unsigned 318
X 011111 /0000 01011 1 mulhwu. Multiply High Word Unsigned & record CR 318
X 011111 00000 10011 / mfcr Move From Condition Register 307
X 011111 00000 10100 / lwarx Load Word & Reserve Indexed 300
X 011111 00000 10110 / icbt Instruction Cache Block Touch Indexed 287
X 011111 00000 10111 / lwzx Load Word & Zero Indexed 303
X 011111 00000 11000 0 slw Shift Left Word 336
X 011111 00000 11000 1 slw. Shift Left Word & record CR 336
X 011111 00000 11010 0 cntlzw Count Leading Zeros Word 243
X 011111 00000 11010 1 cntlzw. Count Leading Zeros Word & record CR 243
X 011111 00000 11011 / sld Shift Left Doubleword 335
X 011111 00000 11100 0 and AND 236
X 011111 00000 11100 1 and. AND & record CR 236
X 011111 00000 11110 / icbte Instruction Cache Block Touch Indexed Extended 287
X 011111 00000 11111 / lwzxe Load Word & Zero Indexed Extended 303
X 011111 00001 00000 / cmpl Compare Logical 242
X 011111 00001 01000 0 subf Subtract From 355
X 011111 00001 01000 1 subf. Subtract From & record CR 355
X 011111 00001 10110 / dcbst Data Cache Block Store Indexed 251
X 011111 00001 10111 / lwzux Load Word & Zero with Update Indexed 303
X 011111 00001 11010 / cntlzd Count Leading Zeros Doubleword 243
X 011111 00001 11100 0 andc AND with Complement 236
X 011111 00001 11100 1 andc. AND with Complement & record CR 236
X 011111 00001 11110 / dcbste Data Cache Block Store Indexed Extended 251
X 011111 00001 11111 / lwzuxe Load Word & Zero with Update Indexed Extended 303
X 011111 00010 00100 / td Trap Doubleword 361
X 011111 /0010 01001 / mulhd Multiply High Doubleword 317
X 011111 /0010 01011 0 mulhw Multiply High Word 318
X 011111 /0010 01011 1 mulhw. Multiply High Word & record CR 318
X 011111 00010 10011 / mfmsr Move From Machine State Register 308
X 011111 00010 10110 / dcbf Data Cache Block Flush Indexed 248
X 011111 00010 10111 / lbzx Load Byte & Zero Indexed 289
X 011111 00010 11110 / dcbfe Data Cache Block Flush Indexed Extended 248
X 011111 00010 11111 / lbzxe Load Byte & Zero Indexed Extended 289
X 011111 00011 01000 0 neg Negate 322
X 011111 00011 01000 1 neg. Negate & record CR 322
X 011111 00011 10111 / lbzux Load Byte & Zero with Update Indexed 289
X 011111 00011 11100 0 nor NOR 323
X 011111 00011 11100 1 nor. NOR & record CR 323
X 011111 00011 11110 / lwarxe Load Word & Reserve Indexed Extended 300
X 011111 00011 11111 / lbzuxe Load Byte & Zero with Update Indexed Extended 289
X 011111 00100 00011 / wrtee Write External Enable 368
X 011111 00100 01000 0 subfe Subtract From Extended with CA 357
X 011111 00100 01000 1 subfe. Subtract From Extended with CA & record CR 357
X 011111 00100 01010 0 adde Add Extended with CA 231
X 011111 00100 01010 1 adde. Add Extended with CA & record CR 231
XFX 011111 00100 10000 / mtcrf Move To Condition Register Fields 311
X 011111 00100 10010 / mtmsr Move To Machine State Register 315
X 011111 00100 10110 1 stwcx. Store Word Conditional Indexed & record CR 353
X 011111 00100 10111 / stwx Store Word Indexed 351
X 011111 00100 11110 1 stwcxe. Store Word Conditional Indexed Extended & record CR 353
X 011111 00100 11111 / stwxe Store Word Indexed Extended 351
X 011111 00101 00011 / wrteei Write External Enable Immediate 368
X 011111 00101 10111 / stwux Store Word with Update Indexed 351
X 011111 00101 11111 / stwuxe Store Word with Update Indexed Extended 351
X 011111 00110 01000 0 subfze Subtract From Zero Extended with CA 360
X 011111 00110 01000 1 subfze. Subtract From Zero Extended with CA & record CR 360
X 011111 00110 01010 0 addze Add to Zero Extended with CA 235
X 011111 00110 01010 1 addze. Add to Zero Extended with CA & record CR 235
X 011111 00110 10111 / stbx Store Byte Indexed 341
X 011111 00110 11111 / stbxe Store Byte Indexed Extended 341
X 011111 00111 01000 0 subfme Subtract From Minus One Extended with CA 359
X 011111 00111 01000 1 subfme. Subtract From Minus One Extended with CA & record CR 359
X 011111 00111 01001 / mulld Multiply Low Doubleword 319
X 011111 00111 01010 0 addme Add to Minus One Extended with CA 234
X 011111 00111 01010 1 addme. Add to Minus One Extended with CA & record CR 234
X 011111 00111 01011 0 mullw Multiply Low Word 320
X 011111 00111 01011 1 mullw. Multiply Low Word & record CR 320
X 011111 00111 10110 / dcbtst Data Cache Block Touch for Store Indexed 253
X 011111 00111 10111 / stbux Store Byte with Update Indexed 341
X 011111 00111 11110 / dcbtste Data Cache Block Touch for Store Indexed Extended 253
X 011111 00111 11111 / stbuxe Store Byte with Update Indexed Extended 341
X 011111 01000 01010 0 add Add 229
X 011111 01000 01010 1 add. Add & record CR 229
X 011111 01000 10011 / mfapidi Move From APID Indirect 307
X 011111 01000 10110 / dcbt Data Cache Block Touch Indexed 252
X 011111 01000 10111 / lhzx Load Halfword & Zero Indexed 296
X 011111 01000 11100 0 eqv Equivalent 259
X 011111 01000 11100 1 eqv. Equivalent & record CR 259
X 011111 01000 11110 / dcbte Data Cache Block Touch Indexed Extended 252
X 011111 01000 11111 / lhzxe Load Halfword & Zero Indexed Extended 296
X 011111 01001 10111 / lhzux Load Halfword & Zero with Update Indexed 296
X 011111 01001 11100 0 xor XOR 369
X 011111 01001 11100 1 xor. XOR & record CR 369
X 011111 01001 11111 / lhzuxe Load Halfword & Zero with Update Indexed Extended 296
XFX 011111 01010 00011 / mfdcr Move From Device Control Register 307
XFX 011111 01010 10011 / mfspr Move From Special Purpose Register 309
X 011111 01010 10111 / lhax Load Halfword Algebraic Indexed 294
X 011111 01010 11111 / lhaxe Load Halfword Algebraic Indexed Extended 294
X 011111 01011 10111 / lhaux Load Halfword Algebraic with Update Indexed 294
X 011111 01011 11111 / lhauxe Load Halfword Algebraic with Update Indexed Extended 294
X 011111 01100 01000 / subfe64 Subtract From Extended with CA64 357
X 011111 01100 01010 / adde64 Add Extended with CA64 231
X 011111 01100 10111 / sthx Store Halfword Indexed 347
X 011111 01100 11100 0 orc OR with Complement 324
X 011111 01100 11100 1 orc. OR with Complement & record CR 324
X 011111 01100 11111 / sthxe Store Halfword Indexed Extended 347
X 011111 01101 10111 / sthux Store Halfword with Update Indexed 347
X 011111 01101 11100 0 or OR 324
X 011111 01101 11100 1 or. OR & record CR 324
X 011111 01101 11111 / sthuxe Store Halfword with Update Indexed Extended 347
XFX 011111 01110 00011 / mtdcr Move To Device Control Register 311
X 011111 01110 01000 / subfze64 Subtract From Zero Extended with CA64 360
X 011111 01110 01001 / divdu Divide Doubleword Unsigned 256
X 011111 01110 01010 / addze64 Add to Zero Extended with CA64 235
X 011111 01110 01011 0 divwu Divide Word Unsigned 258
X 011111 01110 01011 1 divwu. Divide Word Unsigned & record CR 258
XFX 011111 01110 10011 / mtspr Move To Special Purpose Register 316
X 011111 01110 10110 / dcbi Data Cache Block Invalidate Indexed 249
X 011111 01110 11100 0 nand NAND 321
X 011111 01110 11100 1 nand. NAND & record CR 321
X 011111 01110 11110 / dcbie Data Cache Block Invalidate Indexed Extended 249
X 011111 01110 11111 / ldarxe Load Doubleword & Reserve Indexed Extended 290
X 011111 01111 01000 / subfme64 Subtract From Minus One Extended with CA64 359
X 011111 01111 01001 / divd Divide Doubleword 255
X 011111 01111 01010 / addme64 Add to Minus One Extended with CA64 234
X 011111 01111 01011 0 divw Divide Word 257
X 011111 01111 01011 1 divw. Divide Word & record CR 257
X 011111 01111 11111 1 stdcxe. Store Doubleword Conditional Indexed Extended 342
X 011111 10000 00000 / mcrxr Move to Condition Register from XER 306
X 011111 10000 01000 0 subfco Subtract From Carrying & record OV 356
X 011111 10000 01000 1 subfco. Subtract From Carrying & record OV & CR 356
X 011111 10000 01010 0 addco Add Carrying & record OV 230
X 011111 10000 01010 1 addco. Add Carrying & record OV & CR 230
X 011111 10000 10101 / lswx Load String Word Indexed 298
X 011111 10000 10110 / lwbrx Load Word Byte-Reverse Indexed 302
X 011111 10000 10111 / lfsx Load Floating-Point Single Indexed 293
X 011111 10000 11000 0 srw Shift Right Word 340
X 011111 10000 11000 1 srw. Shift Right Word & record CR 340
X 011111 10000 11011 / srd Shift Right Doubleword 339
X 011111 10000 11110 / lwbrxe Load Word Byte-Reverse Indexed Extended 302
X 011111 10000 11111 / lfsxe Load Floating-Point Single Indexed Extended 293
X 011111 10001 00000 / mcrxr64 Move to Condition Register from XER64 306
X 011111 10001 01000 0 subfo Subtract From & record OV 355
X 011111 10001 01000 1 subfo. Subtract From & record OV & CR 355
X 011111 10001 10110 / tlbsync TLB Synchronize 365
X 011111 10001 10111 / lfsux Load Floating-Point Single with Update Indexed 293
X 011111 10001 11111 / lfsuxe Load Floating-Point Single with Update Indexed Extended 293
Legend:
- Don’t care, usually part of an operand field
/ Reserved bit, invalid instruction form if encoded as 1
? Allocated for implementation-dependent use. See the User’s Manual for the implementation
X 011111 10010 10101 / lswi Load String Word Immediate 298
X 011111 10010 10110 / msync Memory Synchronize 310
X 011111 10010 10111 / lfdx Load Floating-Point Double Indexed 292
X 011111 10010 11111 / lfdxe Load Floating-Point Double Indexed Extended 292
X 011111 10011 01000 0 nego Negate & record OV 322
X 011111 10011 01000 1 nego. Negate & record OV & record CR 322
X 011111 10011 10111 / lfdux Load Floating-Point Double with Update Indexed 292
X 011111 10011 11111 / lfduxe Load Floating-Point Double with Update Indexed Extended 292
X 011111 10100 01000 0 subfeo Subtract From Extended with CA & record OV 357
X 011111 10100 01000 1 subfeo. Subtract From Extended with CA & record OV & CR 357
X 011111 10100 01010 0 addeo Add Extended with CA & record OV 231
X 011111 10100 01010 1 addeo. Add Extended with CA & record OV & CR 231
X 011111 10100 10101 / stswx Store String Word Indexed 350
X 011111 10100 10110 / stwbrx Store Word Byte-Reverse Indexed 352
X 011111 10100 10111 / stfsx Store Floating-Point Single Indexed 346
X 011111 10100 11110 / stwbrxe Store Word Byte-Reverse Indexed Extended 352
X 011111 10100 11111 / stfsxe Store Floating-Point Single Indexed Extended 346
X 011111 10101 10111 / stfsux Store Floating-Point Single with Update Indexed 346
X 011111 10101 11111 / stfsuxe Store Floating-Point Single with Update Indexed Extended 346
X 011111 10110 01000 0 subfzeo Subtract From Zero Extended with CA & record OV 360
X 011111 10110 01000 1 subfzeo. Subtract From Zero Extended with CA & record OV & CR 360
X 011111 10110 01010 0 addzeo Add to Zero Extended with CA & record OV 235
X 011111 10110 01010 1 addzeo. Add to Zero Extended with CA & record OV & CR 235
X 011111 10110 10101 / stswi Store String Word Immediate 350
X 011111 10110 10111 / stfdx Store Floating-Point Double Indexed 344
X 011111 10110 11111 / stfdxe Store Floating-Point Double Indexed Extended 344
X 011111 10111 01000 0 subfmeo Subtract From Minus One Extended with CA & record OV 359
X 011111 10111 01000 1 subfmeo. Subtract From Minus One Extended with CA & record OV & CR 359
X 011111 10111 01001 / mulldo Multiply Low Doubleword & record OV 319
X 011111 10111 01010 0 addmeo Add to Minus One Extended with CA & record OV 234
X 011111 10111 01010 1 addmeo. Add to Minus One Extended with CA & record OV & CR 234
X 011111 10111 01011 0 mullwo Multiply Low Word & record OV 320
X 011111 10111 01011 1 mullwo. Multiply Low Word & record OV & CR 320
X 011111 10111 10110 / dcba Data Cache Block Allocate Indexed 247
X 011111 10111 10111 / stfdux Store Floating-Point Double with Update Indexed 344
X 011111 10111 11110 / dcbae Data Cache Block Allocate Indexed Extended 247
X 011111 10111 11111 / stfduxe Store Floating-Point Double with Update Indexed Extended 344
X 011111 11000 01010 0 addo Add & record OV 229
X 011111 11000 01010 1 addo. Add & record OV & CR 229
X 011111 11000 10010 / tlbivax TLB Invalidate Virtual Address Indexed 362
X 011111 11000 10011 / tlbivaxe TLB Invalidate Virtual Address Indexed Extended 362
X 011111 11000 10110 / lhbrx Load Halfword Byte-Reverse Indexed 295
X 011111 11000 11000 0 sraw Shift Right Algebraic Word 338
X 011111 11000 11000 1 sraw. Shift Right Algebraic Word & record CR 338
X 011111 11000 11010 / srad Shift Right Algebraic Doubleword 337
X 011111 11000 11110 / lhbrxe Load Halfword Byte-Reverse Indexed Extended 295
X 011111 11000 11111 / ldxe Load Doubleword Indexed Extended 291
X 011111 11001 11000 0 srawi Shift Right Algebraic Word Immediate 338
X 011111 11001 11000 1 srawi. Shift Right Algebraic Word Immediate & record CR 338
XS 011111 11001 1101- / sradi Shift Right Algebraic Doubleword Immediate 337
X 011111 11001 11111 / lduxe Load Doubleword with Update Indexed Extended 291
X 011111 11010 10110 / mbar Memory Barrier 304
X 011111 11100 01000 / subfe64o Subtract From Extended with CA64 & record OV 357
X 011111 11100 01010 / adde64o Add Extended with CA64 & record OV 231
X 011111 11100 10010 ? tlbsx TLB Search Indexed 364
X 011111 11100 10011 ? tlbsxe TLB Search Indexed Extended 364
X 011111 11100 10110 / sthbrx Store Halfword Byte-Reverse Indexed 348
X 011111 11100 11010 0 extsh Extend Sign Halfword 260
X 011111 11100 11010 1 extsh. Extend Sign Halfword & record CR 260
X 011111 11100 11110 / sthbrxe Store Halfword Byte-Reverse Indexed Extended 348
X 011111 11100 11111 / stdxe Store Doubleword Indexed Extended 343
X 011111 11101 10010 / tlbre TLB Read Entry 363
X 011111 11101 11010 0 extsb Extend Sign Byte 260
X 011111 11101 11010 1 extsb. Extend Sign Byte & record CR 260
X 011111 11101 11111 / stduxe Store Doubleword with Update Indexed Extended 343
X 011111 11110 01000 / subfze64o Subtract From Zero Extended with CA64 & record OV 360
X 011111 11110 01001 / divduo Divide Doubleword Unsigned & record OV 256
X 011111 11110 01010 / addze64o Add to Zero Extended with CA64 & record OV 235
X 011111 11110 01011 0 divwuo Divide Word Unsigned & record OV 258
X 011111 11110 01011 1 divwuo. Divide Word Unsigned & record OV & CR 258
X 011111 11110 10010 / tlbwe TLB Write Entry 366
X 011111 11110 10110 / icbi Instruction Cache Block Invalidate Indexed 286
X 011111 11110 10111 / stfiwx Store Floating-Point as Int Word Indexed 345
X 011111 11110 11010 / extsw Extend Sign Word 260
X 011111 11110 11110 / icbie Instruction Cache Block Invalidate Indexed Extended 286
X 011111 11110 11111 / stfiwxe Store Floating-Point as Int Word Indexed Extended 345
X 011111 11111 01000 / subfme64o Subtract From Minus One Extended with CA64 & record OV 359
X 011111 11111 01001 / divdo Divide Doubleword & record OV 255
X 011111 11111 01010 / addme64o Add to Minus One Extended with CA64 & record OV 234
X 011111 11111 01011 0 divwo Divide Word & record OV 257
X 011111 11111 01011 1 divwo. Divide Word & record OV & CR 257
X 011111 11111 10110 / dcbz Data Cache Block set to Zero Indexed 254
X 011111 11111 11110 / dcbze Data Cache Block set to Zero Indexed Extended 254
D 100000 ----- ----- - lwz Load Word & Zero 303
D 100001 ----- ----- - lwzu Load Word & Zero with Update 303
D 100010 ----- ----- - lbz Load Byte & Zero 289
D 100011 ----- ----- - lbzu Load Byte & Zero with Update 289
D 100100 ----- ----- - stw Store Word 351
D 100101 ----- ----- - stwu Store Word with Update 351
D 100110 ----- ----- - stb Store Byte 341
D 100111 ----- ----- - stbu Store Byte with Update 341
D 101000 ----- ----- - lhz Load Halfword & Zero 296
D 101001 ----- ----- - lhzu Load Halfword & Zero with Update 296
D 101010 ----- ----- - lha Load Halfword Algebraic 294
D 101011 ----- ----- - lhau Load Halfword Algebraic with Update 294
D 101100 ----- ----- - sth Store Halfword 347
D 101101 ----- ----- - sthu Store Halfword with Update 347
D 101110 ----- ----- - lmw Load Multiple Word 297
D 101111 ----- ----- - stmw Store Multiple Word 349
D 110000 ----- ----- - lfs Load Floating-Point Single 293
D 110001 ----- ----- - lfsu Load Floating-Point Single with Update 293
D 110010 ----- ----- - lfd Load Floating-Point Double 292
D 110011 ----- ----- - lfdu Load Floating-Point Double with Update 292
D 110100 ----- ----- - stfs Store Floating-Point Single 346
D 110101 ----- ----- - stfsu Store Floating-Point Single with Update 346
D 110110 ----- ----- - stfd Store Floating-Point Double 344
D 110111 ----- ----- - stfdu Store Floating-Point Double with Update 344
DE 111010 ----- --000 0 lbze Load Byte & Zero Extended 289
DE 111010 ----- --000 1 lbzue Load Byte & Zero with Update Extended 289
DE 111010 ----- --001 0 lhze Load Halfword & Zero Extended 296
DE 111010 ----- --001 1 lhzue Load Halfword & Zero with Update Extended 296
DE 111010 ----- --010 0 lhae Load Halfword Algebraic Extended 294
DE 111010 ----- --010 1 lhaue Load Halfword Algebraic with Update Extended 294
DE 111010 ----- --011 0 lwze Load Word & Zero Extended 303
DE 111010 ----- --011 1 lwzue Load Word & Zero with Update Extended 303
DE 111010 ----- --100 0 stbe Store Byte Extended 341
DE 111010 ----- --100 1 stbue Store Byte with Update Extended 341
DE 111010 ----- --101 0 sthe Store Halfword Extended 347
DE 111010 ----- --101 1 sthue Store Halfword with Update Extended 347
DE 111010 ----- --111 0 stwe Store Word Extended 351
DE 111010 ----- --111 1 stwue Store Word with Update Extended 351
A 111011 ----- 10010 0 fdivs Floating Divide Single 270
A 111011 ----- 10010 1 fdivs. Floating Divide Single & record CR 270
A 111011 ----- 10100 0 fsubs Floating Subtract Single 285
A 111011 ----- 10100 1 fsubs. Floating Subtract Single & record CR 285
A 111011 ----- 10101 0 fadds Floating Add Single 262
A 111011 ----- 10101 1 fadds. Floating Add Single & record CR 262
A 111011 ----- 10110 0 fsqrts Floating Square Root Single 284
A 111011 ----- 10110 1 fsqrts. Floating Square Root Single & record CR 284
A 111011 ----- 11000 0 fres Floating Reciprocal Estimate Single 278
A 111011 ----- 11000 1 fres. Floating Reciprocal Estimate Single & record CR 278
A 111011 ----- 11001 0 fmuls Floating Multiply Single 274
A 111011 ----- 11001 1 fmuls. Floating Multiply Single & record CR 274
A 111011 ----- 11100 0 fmsubs Floating Multiply-Subtract Single 273
A 111011 ----- 11100 1 fmsubs. Floating Multiply-Subtract Single & record CR 273
A 111011 ----- 11101 0 fmadds Floating Multiply-Add Single 271
A 111011 ----- 11101 1 fmadds. Floating Multiply-Add Single & record CR 271
A 111011 ----- 11110 0 fnmsubs Floating Negative Multiply-Subtract Single 277
A 111011 ----- 11110 1 fnmsubs. Floating Negative Multiply-Subtract Single & record CR 277
A 111011 ----- 11111 0 fnmadds Floating Negative Multiply-Add Single 276
A 111011 ----- 11111 1 fnmadds. Floating Negative Multiply-Add Single & record CR 276
DES 111110 ----- --000 0 lde Load Doubleword Extended 291
DES 111110 ----- --000 1 ldue Load Doubleword with Update Extended 291
DES 111110 ----- --010 0 lfse Load Floating-Point Single Extended 293
DES 111110 ----- --010 1 lfsue Load Floating-Point Single with Update Extended 293
DES 111110 ----- --011 0 lfde Load Floating-Point Double Extended 292
DES 111110 ----- --011 1 lfdue Load Floating-Point Double with Update Extended 292
DES 111110 ----- --100 0 stde Store Doubleword Extended 343
DES 111110 ----- --100 1 stdue Store Doubleword with Update Extended 343
DES 111110 ----- --110 0 stfse Store Floating-Point Single Extended 346
DES 111110 ----- --110 1 stfsue Store Floating-Point Single with Update Extended 346
DES 111110 ----- --111 0 stfde Store Floating-Point Double Extended 344
DES 111110 ----- --111 1 stfdue Store Floating-Point Double with Update Extended 344
A 111111 ----- 10010 0 fdiv Floating Divide 270
A 111111 ----- 10010 1 fdiv. Floating Divide & record CR 270
A 111111 ----- 10100 0 fsub Floating Subtract 285
A 111111 ----- 10100 1 fsub. Floating Subtract & record CR 285
A 111111 ----- 10101 0 fadd Floating Add 262
A 111111 ----- 10101 1 fadd. Floating Add & record CR 262
A 111111 ----- 10110 0 fsqrt Floating Square Root 284
A 111111 ----- 10110 1 fsqrt. Floating Square Root & record CR 284
A 111111 ----- 10111 0 fsel Floating Select 283
A 111111 ----- 10111 1 fsel. Floating Select & record CR 283
A 111111 ----- 11001 0 fmul Floating Multiply 274
A 111111 ----- 11001 1 fmul. Floating Multiply & record CR 274
A 111111 ----- 11010 0 frsqrte Floating Reciprocal Square Root Estimate 282
A 111111 ----- 11010 1 frsqrte. Floating Reciprocal Square Root Estimate & record CR 282
A 111111 ----- 11100 0 fmsub Floating Multiply-Subtract 273
A 111111 ----- 11100 1 fmsub. Floating Multiply-Subtract & record CR 273
A 111111 ----- 11101 0 fmadd Floating Multiply-Add 271
A 111111 ----- 11101 1 fmadd. Floating Multiply-Add & record CR 271
A 111111 ----- 11110 0 fnmsub Floating Negative Multiply-Subtract 277
A 111111 ----- 11110 1 fnmsub. Floating Negative Multiply-Subtract & record CR 277
A 111111 ----- 11111 0 fnmadd Floating Negative Multiply-Add 276
A 111111 ----- 11111 1 fnmadd. Floating Negative Multiply-Add & record CR 276
X 111111 00000 00000 / fcmpu Floating Compare Unordered 265
X 111111 00000 01100 0 frsp Floating Round to Single-Precision 279
X 111111 00000 01100 1 frsp. Floating Round to Single-Precision & record CR 279
X 111111 00000 01110 0 fctiw Floating Convert To Int Word 268
X 111111 00000 01110 1 fctiw. Floating Convert To Int Word & record CR 268
X 111111 00000 01111 0 fctiwz Floating Convert To Int Word with round to Zero 268
X 111111 00000 01111 1 fctiwz. Floating Convert To Int Word with round to Zero & record CR 268
X 111111 00001 00000 / fcmpo Floating Compare Ordered 265
X 111111 00001 00110 0 mtfsb1 Move To FPSCR Bit 1 312
X 111111 00001 00110 1 mtfsb1. Move To FPSCR Bit 1 & record CR 312
X 111111 00001 01000 0 fneg Floating Negate 275
X 111111 00001 01000 1 fneg. Floating Negate & record CR 275
X 111111 00010 00000 / mcrfs Move to Condition Register from FPSCR 306
X 111111 00010 00110 0 mtfsb0 Move To FPSCR Bit 0 312
X 111111 00010 00110 1 mtfsb0. Move To FPSCR Bit 0 & record CR 312
X 111111 00010 01000 0 fmr Floating Move Register 272
X 111111 00010 01000 1 fmr. Floating Move Register & record CR 272
X 111111 00100 00110 0 mtfsfi Move To FPSCR Field Immediate 314
X 111111 00100 00110 1 mtfsfi. Move To FPSCR Field Immediate & record CR 314
X 111111 00100 01000 0 fnabs Floating Negative Absolute Value 275
X 111111 00100 01000 1 fnabs. Floating Negative Absolute Value & record CR 275
X 111111 01000 01000 0 fabs Floating Absolute Value 261
X 111111 01000 01000 1 fabs. Floating Absolute Value & record CR 261
X 111111 10010 00111 0 mffs Move From FPSCR 308
X 111111 10010 00111 1 mffs. Move From FPSCR & record CR 308
XFL 111111 10110 00111 0 mtfsf Move To FPSCR Fields 313
XFL 111111 10110 00111 1 mtfsf. Move To FPSCR Fields & record CR 313
X 111111 11001 01110 / fctid Floating Convert To Int Doubleword 266
X 111111 11001 01111 / fctidz Floating Convert To Int Doubleword with round to Zero 266
X 111111 11010 01110 / fcfid Floating Convert From Int Doubleword 263
Legend:
- Don’t care, usually part of an operand field
/ Reserved bit, invalid instruction form if encoded as 1
? Allocated for implementation-dependent use. See the User’s Manual for the implementation
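The table rows and legend above can be read as bit patterns to test against a 32-bit instruction word. The following is a minimal sketch of that reading, not part of the architecture definition. The names `fields` and `match_row` are hypothetical, and the treatment of a set reserved bit as an "invalid form" result is an interpretation of the legend. It assumes the usual big-endian bit numbering, in which Inst0 is the most significant bit of the word.

```python
def fields(word):
    """Split a 32-bit instruction word into the two columns used by the table."""
    primary = (word >> 26) & 0x3F   # Inst0:5, the six most significant bits
    extended = word & 0x7FF         # Inst21:31, the eleven least significant bits
    return primary, extended


def match_row(word, primary_bits, extended_bits):
    """Compare a word with one table row (hypothetical helper).

    primary_bits  : six-character string such as "011111"
    extended_bits : eleven pattern characters as printed in the table,
                    spaces allowed, e.g. "01000 01010 0"
    Returns "match", "invalid form", or "no match".
    """
    primary, extended = fields(word)
    if primary != int(primary_bits, 2):
        return "no match"

    pattern = extended_bits.replace(" ", "")
    if len(pattern) != 11:
        raise ValueError("extended pattern must describe 11 bits")

    actual = format(extended, "011b")
    reserved_set = False
    for want, got in zip(pattern, actual):
        if want in "-?":
            # '-' is a don't-care bit; '?' is implementation-dependent,
            # so this sketch does not constrain it either.
            continue
        if want == "/":
            # Reserved bit: the row still identifies the instruction,
            # but the form is invalid if the bit is 1.
            reserved_set = reserved_set or got == "1"
            continue
        if want != got:
            return "no match"
    return "invalid form" if reserved_set else "match"


# Rows copied from the table: add, addo, lwz, dcbz.
print(match_row(0x7C642A14, "011111", "01000 01010 0"))  # add r3,r4,r5
print(match_row(0x7C642A14, "011111", "11000 01010 0"))  # same word against addo
print(match_row(0x80640000, "100000", "----- ----- -"))  # lwz r3,0(r4)
print(match_row(0x7C001FED, "011111", "11111 10110 /"))  # dcbz with bit 31 set
```

A full decoder would try each row in turn and report the first that does not return "no match". Keeping the pattern as the printed string makes it easy to check a row against the table by eye.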
This appendix lists all the instructions in Book E, in order by mnemonic.
Format  Primary Opcode (Inst0:5)  Extended Opcode (Inst21:31)  Mnemonic  Instruction  Page
X 011111 01000 01010 0 add Add 229
X 011111 01000 01010 1 add. Add & record CR 229
X 011111 00000 01010 0 addc Add Carrying 230
X 011111 00000 01010 1 addc. Add Carrying & record CR 230
X 011111 10000 01010 0 addco Add Carrying & record OV 230
X 011111 10000 01010 1 addco. Add Carrying & record OV & CR 230
X 011111 00100 01010 0 adde Add Extended with CA 231
X 011111 00100 01010 1 adde. Add Extended with CA & record CR 231
X 011111 01100 01010 / adde64 Add Extended with CA64 231
X 011111 11100 01010 / adde64o Add Extended with CA64 & record OV 231
X 011111 10100 01010 0 addeo Add Extended with CA & record OV 231
X 011111 10100 01010 1 addeo. Add Extended with CA & record OV & CR 231
D 001110 ----- ----- - addi Add Immediate 232
D 001100 ----- ----- - addic Add Immediate Carrying 233
D 001101 ----- ----- - addic. Add Immediate Carrying & record CR 233
D 001111 ----- ----- - addis Add Immediate Shifted 232
X 011111 00111 01010 0 addme Add to Minus One Extended with CA 234
X 011111 00111 01010 1 addme. Add to Minus One Extended with CA & record CR 234
X 011111 01111 01010 / addme64 Add to Minus One Extended with CA64 234
X 011111 11111 01010 / addme64o Add to Minus One Extended with CA64 & record OV 234
X 011111 10111 01010 0 addmeo Add to Minus One Extended with CA & record OV 234
X 011111 10111 01010 1 addmeo. Add to Minus One Extended with CA & record OV & CR 234
X 011111 11000 01010 0 addo Add & record OV 229
X 011111 11000 01010 1 addo. Add & record OV & CR 229
X 011111 00110 01010 0 addze Add to Zero Extended with CA 235
X 011111 00110 01010 1 addze. Add to Zero Extended with CA & record CR 235
X 011111 01110 01010 / addze64 Add to Zero Extended with CA64 235
X 011111 11110 01010 / addze64o Add to Zero Extended with CA64 & record OV 235
X 011111 10110 01010 0 addzeo Add to Zero Extended with CA & record OV 235
X 011111 10110 01010 1 addzeo. Add to Zero Extended with CA & record OV & CR 235
X 011111 00000 11100 0 and AND 236
X 011111 00000 11100 1 and. AND & record CR 236
X 011111 00001 11100 0 andc AND with Complement 236
X 011111 00001 11100 1 andc. AND with Complement & record CR 236
D 011100 ----- ----- - andi. AND Immediate & record CR 236
D 011101 ----- ----- - andis. AND Immediate Shifted & record CR 236
I 010010 ----- ----0 0 b Branch 237
I 010010 ----- ----1 0 ba Branch Absolute 237
B 010000 ----- ----0 0 bc Branch Conditional 238
B 010000 ----- ----1 0 bca Branch Conditional Absolute 238
XL 010011 10000 10000 0 bcctr Branch Conditional to Count Register 239
XL 010011 10000 10001 0 bcctre Branch Conditional to Count Register Extended 239
XL 010011 10000 10001 1 bcctrel Branch Conditional to Count Register Extended & Link 239
XL 010011 10000 10000 1 bcctrl Branch Conditional to Count Register & Link 239
B 001001 ----- ----0 0 bce Branch Conditional Extended 238
B 001001 ----- ----1 0 bcea Branch Conditional Extended Absolute 238
B 001001 ----- ----0 1 bcel Branch Conditional Extended & Link 238
B 001001 ----- ----1 1 bcela Branch Conditional Extended & Link Absolute 238
B 010000 ----- ----0 1 bcl Branch Conditional & Link 238
B 010000 ----- ----1 1 bcla Branch Conditional & Link Absolute 238
XL 010011 00000 10000 0 bclr Branch Conditional to Link Register 240
XL 010011 00000 10001 0 bclre Branch Conditional to Link Register Extended 240
XL 010011 00000 10001 1 bclrel Branch Conditional to Link Register Extended & Link 240
XL 010011 00000 10000 1 bclrl Branch Conditional to Link Register & Link 240
I 010110 ----- ----0 0 be Branch Extended 238
I 010110 ----- ----1 0 bea Branch Extended Absolute 238
I 010110 ----- ----0 1 bel Branch Extended & Link 238
I 010110 ----- ----1 1 bela Branch Extended & Link Absolute 238
I 010010 ----- ----0 1 bl Branch & Link 237
I 010010 ----- ----1 1 bla Branch & Link Absolute 237
X 011111 00000 00000 / cmp Compare 241
D 001011 ----- ----- - cmpi Compare Immediate 241
X 011111 00001 00000 / cmpl Compare Logical 242
D 001010 ----- ----- - cmpli Compare Logical Immediate 242
X 011111 00001 11010 / cntlzd Count Leading Zeros Doubleword 243
X 011111 00000 11010 0 cntlzw Count Leading Zeros Word 243
X 011111 00000 11010 1 cntlzw. Count Leading Zeros Word & record CR 243
XL 010011 01000 00001 / crand Condition Register AND 244
XL 010011 00100 00001 / crandc Condition Register AND with Complement 244
XL 010011 01001 00001 / creqv Condition Register Equivalent 244
XL 010011 00111 00001 / crnand Condition Register NAND 245
XL 010011 00001 00001 / crnor Condition Register NOR 245
XL 010011 01110 00001 / cror Condition Register OR 245
XL 010011 01101 00001 / crorc Condition Register OR with Complement 246
XL 010011 00110 00001 / crxor Condition Register XOR 246
X 011111 10111 10110 / dcba Data Cache Block Allocate Indexed 247
X 011111 10111 11110 / dcbae Data Cache Block Allocate Indexed Extended 247
X 011111 00010 10110 / dcbf Data Cache Block Flush Indexed 248
X 011111 00010 11110 / dcbfe Data Cache Block Flush Indexed Extended 248
X 011111 01110 10110 / dcbi Data Cache Block Invalidate Indexed 249
X 011111 01110 11110 / dcbie Data Cache Block Invalidate Indexed Extended 249
X 011111 00001 10110 / dcbst Data Cache Block Store Indexed 251
X 011111 00001 11110 / dcbste Data Cache Block Store Indexed Extended 251
X 011111 01000 10110 / dcbt Data Cache Block Touch Indexed 252
X 011111 01000 11110 / dcbte Data Cache Block Touch Indexed Extended 252
X 011111 00111 10110 / dcbtst Data Cache Block Touch for Store Indexed 253
X 011111 00111 11110 / dcbtste Data Cache Block Touch for Store Indexed Extended 253
X 011111 11111 10110 / dcbz Data Cache Block set to Zero Indexed 254
X 011111 11111 11110 / dcbze Data Cache Block set to Zero Indexed Extended 254
X 011111 01111 01001 / divd Divide Doubleword 255
X 011111 11111 01001 / divdo Divide Doubleword & record OV 255
X 011111 01110 01001 / divdu Divide Doubleword Unsigned 256
X 011111 11110 01001 / divduo Divide Doubleword Unsigned & record OV 256
X 011111 01111 01011 0 divw Divide Word 257
X 011111 01111 01011 1 divw. Divide Word & record CR 257
X 011111 11111 01011 0 divwo Divide Word & record OV 257
X 011111 11111 01011 1 divwo. Divide Word & record OV & CR 257
X 011111 01110 01011 0 divwu Divide Word Unsigned 258
X 011111 01110 01011 1 divwu. Divide Word Unsigned & record CR 258
X 011111 11110 01011 0 divwuo Divide Word Unsigned & record OV 258
X 011111 11110 01011 1 divwuo. Divide Word Unsigned & record OV & CR 258
X 011111 01000 11100 0 eqv Equivalent 259
X 011111 01000 11100 1 eqv. Equivalent & record CR 259
X 011111 11101 11010 0 extsb Extend Sign Byte 260
X 011111 11101 11010 1 extsb. Extend Sign Byte & record CR 260
X 011111 11100 11010 0 extsh Extend Sign Halfword 260
X 011111 11100 11010 1 extsh. Extend Sign Halfword & record CR 260
X 011111 11110 11010 / extsw Extend Sign Word 260
X 111111 01000 01000 0 fabs Floating Absolute Value 261
X 111111 01000 01000 1 fabs. Floating Absolute Value & record CR 261
A 111111 ----- 10101 0 fadd Floating Add 262
A 111111 ----- 10101 1 fadd. Floating Add & record CR 262
A 111011 ----- 10101 0 fadds Floating Add Single 262
A 111011 ----- 10101 1 fadds. Floating Add Single & record CR 262
X 111111 11010 01110 / fcfid Floating Convert From Int Doubleword 263
X 111111 00001 00000 / fcmpo Floating Compare Ordered 265
X 111111 00000 00000 / fcmpu Floating Compare Unordered 265
X 111111 11001 01110 / fctid Floating Convert To Int Doubleword 266
X 111111 11001 01111 / fctidz Floating Convert To Int Doubleword with round to Zero 266
X 111111 00000 01110 0 fctiw Floating Convert To Int Word 268
X 111111 00000 01110 1 fctiw. Floating Convert To Int Word & record CR 268
X 111111 00000 01111 0 fctiwz Floating Convert To Int Word with round to Zero 268
X 111111 00000 01111 1 fctiwz. Floating Convert To Int Word with round to Zero & record CR 268
A 111111 ----- 10010 0 fdiv Floating Divide 270
A 111111 ----- 10010 1 fdiv. Floating Divide & record CR 270
A 111011 ----- 10010 0 fdivs Floating Divide Single 270
A 111011 ----- 10010 1 fdivs. Floating Divide Single & record CR 270
A 111111 ----- 11101 0 fmadd Floating Multiply-Add 271
A 111111 ----- 11101 1 fmadd. Floating Multiply-Add & record CR 271
A 111011 ----- 11101 0 fmadds Floating Multiply-Add Single 271
A 111011 ----- 11101 1 fmadds. Floating Multiply-Add Single & record CR 271
X 111111 00010 01000 0 fmr Floating Move Register 272
X 111111 00010 01000 1 fmr. Floating Move Register & record CR 272
A 111111 ----- 11100 0 fmsub Floating Multiply-Subtract 273
A 111111 ----- 11100 1 fmsub. Floating Multiply-Subtract & record CR 273
A 111011 ----- 11100 0 fmsubs Floating Multiply-Subtract Single 273
A 111011 ----- 11100 1 fmsubs. Floating Multiply-Subtract Single & record CR 273
A 111111 ----- 11001 0 fmul Floating Multiply 274
A 111111 ----- 11001 1 fmul. Floating Multiply & record CR 274
A 111011 ----- 11001 0 fmuls Floating Multiply Single 274
A 111011 ----- 11001 1 fmuls. Floating Multiply Single & record CR 274
X 111111 00100 01000 0 fnabs Floating Negative Absolute Value 275
X 111111 00100 01000 1 fnabs. Floating Negative Absolute Value & record CR 275
X 111111 00001 01000 0 fneg Floating Negate 275
X 111111 00001 01000 1 fneg. Floating Negate & record CR 275
A 111111 ----- 11111 0 fnmadd Floating Negative Multiply-Add 276
A 111111 ----- 11111 1 fnmadd. Floating Negative Multiply-Add & record CR 276
A 111011 ----- 11111 0 fnmadds Floating Negative Multiply-Add Single 276
A 111011 ----- 11111 1 fnmadds. Floating Negative Multiply-Add Single & record CR 276
A 111111 ----- 11110 0 fnmsub Floating Negative Multiply-Subtract 277
A 111111 ----- 11110 1 fnmsub. Floating Negative Multiply-Subtract & record CR 277
A 111011 ----- 11110 0 fnmsubs Floating Negative Multiply-Subtract Single 277
A 111011 ----- 11110 1 fnmsubs. Floating Negative Multiply-Subtract Single & record CR 277
A 111011 ----- 11000 0 fres Floating Reciprocal Estimate Single 278
A 111011 ----- 11000 1 fres. Floating Reciprocal Estimate Single & record CR 278
X 111111 00000 01100 0 frsp Floating Round to Single-Precision 279
X 111111 00000 01100 1 frsp. Floating Round to Single-Precision & record CR 279
A 111111 ----- 11010 0 frsqrte Floating Reciprocal Square Root Estimate 282
A 111111 ----- 11010 1 frsqrte. Floating Reciprocal Square Root Estimate & record CR 282
A 111111 ----- 10111 0 fsel Floating Select 283
A 111111 ----- 10111 1 fsel. Floating Select & record CR 283
A 111111 ----- 10110 0 fsqrt Floating Square Root 284
A 111111 ----- 10110 1 fsqrt. Floating Square Root & record CR 284
A 111011 ----- 10110 0 fsqrts Floating Square Root Single 284
A 111011 ----- 10110 1 fsqrts. Floating Square Root Single & record CR 284
A 111111 ----- 10100 0 fsub Floating Subtract 285
A 111111 ----- 10100 1 fsub. Floating Subtract & record CR 285
A 111011 ----- 10100 0 fsubs Floating Subtract Single 285
A 111011 ----- 10100 1 fsubs. Floating Subtract Single & record CR 285
X 011111 11110 10110 / icbi Instruction Cache Block Invalidate Indexed 286
X 011111 11110 11110 / icbie Instruction Cache Block Invalidate Indexed Extended 286
X 011111 00000 10110 / icbt Instruction Cache Block Touch Indexed 287
X 011111 00000 11110 / icbte Instruction Cache Block Touch Indexed Extended 287
XL 010011 00100 10110 / isync Instruction Synchronize 288
D 100010 ----- ----- - lbz Load Byte & Zero 289
DE 111010 ----- --000 0 lbze Load Byte & Zero Extended 289
D 100011 ----- ----- - lbzu Load Byte & Zero with Update 289
DE 111010 ----- --000 1 lbzue Load Byte & Zero with Update Extended 289
X 011111 00011 10111 / lbzux Load Byte & Zero with Update Indexed 289
X 011111 00011 11111 / lbzuxe Load Byte & Zero with Update Indexed Extended 289
X 011111 00010 10111 / lbzx Load Byte & Zero Indexed 289
X 011111 00010 11111 / lbzxe Load Byte & Zero Indexed Extended 289
X 011111 01110 11111 / ldarxe Load Doubleword & Reserve Indexed Extended 290
DES 111110 ----- --000 0 lde Load Doubleword Extended 291
DES 111110 ----- --000 1 ldue Load Doubleword with Update Extended 291
X 011111 11001 11111 / lduxe Load Doubleword with Update Indexed Extended 291
X 011111 11000 11111 / ldxe Load Doubleword Indexed Extended 291
D 110010 ----- ----- - lfd Load Floating-Point Double 292
DES 111110 ----- --011 0 lfde Load Floating-Point Double Extended 292
D 110011 ----- ----- - lfdu Load Floating-Point Double with Update 292
DES 111110 ----- --011 1 lfdue Load Floating-Point Double with Update Extended 292
X 011111 10011 10111 / lfdux Load Floating-Point Double with Update Indexed 292
X 011111 10011 11111 / lfduxe Load Floating-Point Double with Update Indexed Extended 292
X 011111 10010 10111 / lfdx Load Floating-Point Double Indexed 292
X 011111 10010 11111 / lfdxe Load Floating-Point Double Indexed Extended 292
D 110000 ----- ----- - lfs Load Floating-Point Single 293
DES 111110 ----- --010 0 lfse Load Floating-Point Single Extended 293
D 110001 ----- ----- - lfsu Load Floating-Point Single with Update 293
DES 111110 ----- --010 1 lfsue Load Floating-Point Single with Update Extended 293
X 011111 10001 10111 / lfsux Load Floating-Point Single with Update Indexed 293
X 011111 10001 11111 / lfsuxe Load Floating-Point Single with Update Indexed Extended 293
X 011111 10000 10111 / lfsx Load Floating-Point Single Indexed 293
X 011111 10000 11111 / lfsxe Load Floating-Point Single Indexed Extended 293
D 101010 ----- ----- - lha Load Halfword Algebraic 294
DE 111010 ----- --010 0 lhae Load Halfword Algebraic Extended 294
D 101011 ----- ----- - lhau Load Halfword Algebraic with Update 294
DE 111010 ----- --010 1 lhaue Load Halfword Algebraic with Update Extended 294
X 011111 01011 10111 / lhaux Load Halfword Algebraic with Update Indexed 294
X 011111 01011 11111 / lhauxe Load Halfword Algebraic with Update Indexed Extended 294
X 011111 01010 10111 / lhax Load Halfword Algebraic Indexed 294
X 011111 01010 11111 / lhaxe Load Halfword Algebraic Indexed Extended 294
X 011111 11000 10110 / lhbrx Load Halfword Byte-Reverse Indexed 295
X 011111 11000 11110 / lhbrxe Load Halfword Byte-Reverse Indexed Extended 295
D 101000 ----- ----- - lhz Load Halfword & Zero 296
DE 111010 ----- --001 0 lhze Load Halfword & Zero Extended 296
D 101001 ----- ----- - lhzu Load Halfword & Zero with Update 296
DE 111010 ----- --001 1 lhzue Load Halfword & Zero with Update Extended 296
X 011111 01001 10111 / lhzux Load Halfword & Zero with Update Indexed 296
X 011111 01001 11111 / lhzuxe Load Halfword & Zero with Update Indexed Extended 296
X 011111 01000 10111 / lhzx Load Halfword & Zero Indexed 296
X 011111 01000 11111 / lhzxe Load Halfword & Zero Indexed Extended 296
D 101110 ----- ----- - lmw Load Multiple Word 297
X 011111 10010 10101 / lswi Load String Word Immediate 298
X 011111 10000 10101 / lswx Load String Word Indexed 298
X 011111 00000 10100 / lwarx Load Word & Reserve Indexed 300
X 011111 00011 11110 / lwarxe Load Word & Reserve Indexed Extended 300
X 011111 10000 10110 / lwbrx Load Word Byte-Reverse Indexed 302
X 011111 10000 11110 / lwbrxe Load Word Byte-Reverse Indexed Extended 302
D 100000 ----- ----- - lwz Load Word & Zero 303
DE 111010 ----- --011 0 lwze Load Word & Zero Extended 303
Legend:
- Don’t care, usually part of an operand field
/ Reserved bit, invalid instruction form if encoded as 1
? Allocated for implementation-dependent use. See User’s Manual for the implementation
Format Primary (Inst0:5) Extended (Inst21:31) Mnemonic Instruction Page
D 100001 ----- ----- - lwzu Load Word & Zero with Update 303
DE 111010 ----- --011 1 lwzue Load Word & Zero with Update Extended 303
X 011111 00001 10111 / lwzux Load Word & Zero with Update Indexed 303
X 011111 00001 11111 / lwzuxe Load Word & Zero with Update Indexed Extended 303
X 011111 00000 10111 / lwzx Load Word & Zero Indexed 303
X 011111 00000 11111 / lwzxe Load Word & Zero Indexed Extended 303
X 011111 11010 10110 / mbar Memory Barrier 304
XL 010011 00000 00000 / mcrf Move Condition Register Field 305
X 111111 00010 00000 / mcrfs Move to Condition Register from FPSCR 306
X 011111 10000 00000 / mcrxr Move to Condition Register from XER 306
X 011111 10001 00000 / mcrxr64 Move to Condition Register from XER64 306
X 011111 01000 10011 / mfapidi Move From APID Indirect 307
X 011111 00000 10011 / mfcr Move From Condition Register 307
XFX 011111 01010 00011 / mfdcr Move From Device Control Register 307
X 111111 10010 00111 0 mffs Move From FPSCR 308
X 111111 10010 00111 1 mffs. Move From FPSCR & record CR 308
X 011111 00010 10011 / mfmsr Move From Machine State Register 308
XFX 011111 01010 10011 / mfspr Move From Special Purpose Register 309
X 011111 10010 10110 / msync Memory Synchronize 310
XFX 011111 00100 10000 / mtcrf Move To Condition Register Fields 311
XFX 011111 01110 00011 / mtdcr Move To Device Control Register 311
X 111111 00010 00110 0 mtfsb0 Move To FPSCR Bit 0 312
X 111111 00010 00110 1 mtfsb0. Move To FPSCR Bit 0 & record CR 312
X 111111 00001 00110 0 mtfsb1 Move To FPSCR Bit 1 312
X 111111 00001 00110 1 mtfsb1. Move To FPSCR Bit 1 & record CR 312
XFL 111111 10110 00111 0 mtfsf Move To FPSCR Fields 313
XFL 111111 10110 00111 1 mtfsf. Move To FPSCR Fields & record CR 313
X 111111 00100 00110 0 mtfsfi Move To FPSCR Field Immediate 314
X 111111 00100 00110 1 mtfsfi. Move To FPSCR Field Immediate & record CR 314
X 011111 00100 10010 / mtmsr Move To Machine State Register 315
XFX 011111 01110 10011 / mtspr Move To Special Purpose Register 316
X 011111 /0010 01001 / mulhd Multiply High Doubleword 317
X 011111 /0000 01001 / mulhdu Multiply High Doubleword Unsigned 317
X 011111 /0010 01011 0 mulhw Multiply High Word 318
X 011111 /0010 01011 1 mulhw. Multiply High Word & record CR 318
X 011111 /0000 01011 0 mulhwu Multiply High Word Unsigned 318
X 011111 /0000 01011 1 mulhwu. Multiply High Word Unsigned & record CR 318
X 011111 00111 01001 / mulld Multiply Low Doubleword 319
X 011111 10111 01001 / mulldo Multiply Low Doubleword & record OV 319
D 000111 ----- ----- - mulli Multiply Low Immediate 319
X 011111 00111 01011 0 mullw Multiply Low Word 320
X 011111 00111 01011 1 mullw. Multiply Low Word & record CR 320
X 011111 10111 01011 0 mullwo Multiply Low Word & record OV 320
X 011111 10111 01011 1 mullwo. Multiply Low Word & record OV & CR 320
X 011111 01110 11100 0 nand NAND 321
X 011111 01110 11100 1 nand. NAND & record CR 321
X 011111 00011 01000 0 neg Negate 322
X 011111 00011 01000 1 neg. Negate & record CR 322
X 011111 10011 01000 0 nego Negate & record OV 322
X 011111 10011 01000 1 nego. Negate & record OV & record CR 322
X 011111 00011 11100 0 nor NOR 323
X 011111 00011 11100 1 nor. NOR & record CR 323
X 011111 01101 11100 0 or OR 324
X 011111 01101 11100 1 or. OR & record CR 324
X 011111 01100 11100 0 orc OR with Complement 324
X 011111 01100 11100 1 orc. OR with Complement & record CR 324
D 011000 ----- ----- - ori OR Immediate 324
D 011001 ----- ----- - oris OR Immediate Shifted 324
XL 010011 00001 10011 / rfci Return From Critical Interrupt 325
XL 010011 00001 10010 / rfi Return From Interrupt 326
MDS 011110 ----- -1000 / rldcl Rotate Left Doubleword then Clear Left 327
MDS 011110 ----- -1001 / rldcr Rotate Left Doubleword then Clear Right 328
MD 011110 ----- -010- / rldic Rotate Left Doubleword Immediate then Clear 329
MD 011110 ----- -000- / rldicl Rotate Left Doubleword Immediate then Clear Left 327
MD 011110 ----- -001- / rldicr Rotate Left Doubleword Immediate then Clear Right 328
MD 011110 ----- -011- / rldimi Rotate Left Doubleword Immediate then Mask Insert 330
M 010100 ----- ----- 0 rlwimi Rotate Left Word Immed then Mask Insert 331
M 010100 ----- ----- 1 rlwimi. Rotate Left Word Immed then Mask Insert & record CR 331
M 010101 ----- ----- 0 rlwinm Rotate Left Word Immed then AND with Mask 332
M 010101 ----- ----- 1 rlwinm. Rotate Left Word Immed then AND with Mask & record CR 332
M 010111 ----- ----- 0 rlwnm Rotate Left Word then AND with Mask 332
M 010111 ----- ----- 1 rlwnm. Rotate Left Word then AND with Mask & record CR 332
SC 010001 ///// ////1 / sc System Call 334
X 011111 00000 11011 / sld Shift Left Doubleword 335
X 011111 00000 11000 0 slw Shift Left Word 336
X 011111 00000 11000 1 slw. Shift Left Word & record CR 336
X 011111 11000 11010 / srad Shift Right Algebraic Doubleword 337
XS 011111 11001 1101- / sradi Shift Right Algebraic Doubleword Immediate 337
X 011111 11000 11000 0 sraw Shift Right Algebraic Word 338
X 011111 11000 11000 1 sraw. Shift Right Algebraic Word & record CR 338
X 011111 11001 11000 0 srawi Shift Right Algebraic Word Immediate 338
X 011111 11001 11000 1 srawi. Shift Right Algebraic Word Immediate & record CR 338
X 011111 10000 11011 / srd Shift Right Doubleword 339
X 011111 10000 11000 0 srw Shift Right Word 340
X 011111 10000 11000 1 srw. Shift Right Word & record CR 340
D 100110 ----- ----- - stb Store Byte 341
DE 111010 ----- --100 0 stbe Store Byte Extended 341
DE 111010 ----- --100 1 stbue Store Byte with Update Extended 341
D 100111 ----- ----- - stbu Store Byte with Update 341
X 011111 00111 10111 / stbux Store Byte with Update Indexed 341
X 011111 00111 11111 / stbuxe Store Byte with Update Indexed Extended 341
X 011111 00110 10111 / stbx Store Byte Indexed 341
X 011111 00110 11111 / stbxe Store Byte Indexed Extended 341
X 011111 01111 11111 1 stdcxe. Store Doubleword Conditional Indexed Extended 342
DES 111110 ----- --100 0 stde Store Doubleword Extended 343
DES 111110 ----- --100 1 stdue Store Doubleword with Update Extended 343
X 011111 11101 11111 / stduxe Store Doubleword with Update Indexed Extended 343
X 011111 11100 11111 / stdxe Store Doubleword Indexed Extended 343
D 110110 ----- ----- - stfd Store Floating-Point Double 344
DES 111110 ----- --111 0 stfde Store Floating-Point Double Extended 344
D 110111 ----- ----- - stfdu Store Floating-Point Double with Update 344
DES 111110 ----- --111 1 stfdue Store Floating-Point Double with Update Extended 344
X 011111 10111 10111 / stfdux Store Floating-Point Double with Update Indexed 344
X 011111 10111 11111 / stfduxe Store Floating-Point Double with Update Indexed Extended 344
X 011111 10110 10111 / stfdx Store Floating-Point Double Indexed 344
X 011111 10110 11111 / stfdxe Store Floating-Point Double Indexed Extended 344
X 011111 11110 10111 / stfiwx Store Floating-Point as Int Word Indexed 345
X 011111 11110 11111 / stfiwxe Store Floating-Point as Int Word Indexed Extended 345
D 110100 ----- ----- - stfs Store Floating-Point Single 346
DES 111110 ----- --110 0 stfse Store Floating-Point Single Extended 346
D 110101 ----- ----- - stfsu Store Floating-Point Single with Update 346
DES 111110 ----- --110 1 stfsue Store Floating-Point Single with Update Extended 346
X 011111 10101 10111 / stfsux Store Floating-Point Single with Update Indexed 346
X 011111 10101 11111 / stfsuxe Store Floating-Point Single with Update Indexed Extended 346
X 011111 10100 10111 / stfsx Store Floating-Point Single Indexed 346
X 011111 10100 11111 / stfsxe Store Floating-Point Single Indexed Extended 346
D 101100 ----- ----- - sth Store Halfword 347
X 011111 11100 10110 / sthbrx Store Halfword Byte-Reverse Indexed 348
X 011111 11100 11110 / sthbrxe Store Halfword Byte-Reverse Indexed Extended 348
DE 111010 ----- --101 0 sthe Store Halfword Extended 347
D 101101 ----- ----- - sthu Store Halfword with Update 347
DE 111010 ----- --101 1 sthue Store Halfword with Update Extended 347
X 011111 01101 10111 / sthux Store Halfword with Update Indexed 347
X 011111 01101 11111 / sthuxe Store Halfword with Update Indexed Extended 347
X 011111 01100 10111 / sthx Store Halfword Indexed 347
X 011111 01100 11111 / sthxe Store Halfword Indexed Extended 347
D 101111 ----- ----- - stmw Store Multiple Word 349
X 011111 10110 10101 / stswi Store String Word Immediate 350
X 011111 10100 10101 / stswx Store String Word Indexed 350
D 100100 ----- ----- - stw Store Word 351
X 011111 10100 10110 / stwbrx Store Word Byte-Reverse Indexed 352
X 011111 10100 11110 / stwbrxe Store Word Byte-Reverse Indexed Extended 352
X 011111 00100 10110 1 stwcx. Store Word Conditional Indexed & record CR 353
X 011111 00100 11110 1 stwcxe. Store Word Conditional Indexed Extended & record CR 353
DE 111010 ----- --111 0 stwe Store Word Extended 351
D 100101 ----- ----- - stwu Store Word with Update 351
DE 111010 ----- --111 1 stwue Store Word with Update Extended 351
X 011111 00101 10111 / stwux Store Word with Update Indexed 351
X 011111 00101 11111 / stwuxe Store Word with Update Indexed Extended 351
X 011111 00100 10111 / stwx Store Word Indexed 351
X 011111 00100 11111 / stwxe Store Word Indexed Extended 351
X 011111 00001 01000 0 subf Subtract From 355
X 011111 00001 01000 1 subf. Subtract From & record CR 355
X 011111 00000 01000 0 subfc Subtract From Carrying 356
X 011111 00000 01000 1 subfc. Subtract From Carrying & record CR 356
X 011111 10000 01000 0 subfco Subtract From Carrying & record OV 356
X 011111 10000 01000 1 subfco. Subtract From Carrying & record OV & CR 356
X 011111 00100 01000 0 subfe Subtract From Extended with CA 357
X 011111 00100 01000 1 subfe. Subtract From Extended with CA & record CR 357
X 011111 01100 01000 / subfe64 Subtract From Extended with CA64 357
X 011111 11100 01000 / subfe64o Subtract From Extended with CA64 & record OV 357
X 011111 10100 01000 0 subfeo Subtract From Extended with CA & record OV 357
X 011111 10100 01000 1 subfeo. Subtract From Extended with CA & record OV & CR 357
D 001000 ----- ----- - subfic Subtract From Immediate Carrying 358
X 011111 00111 01000 0 subfme Subtract From Minus One Extended with CA 359
X 011111 00111 01000 1 subfme. Subtract From Minus One Extended with CA & record CR 359
X 011111 01111 01000 / subfme64 Subtract From Minus One Extended with CA64 359
X 011111 11111 01000 / subfme64o Subtract From Minus One Extended with CA64 & record OV 359
X 011111 10111 01000 0 subfmeo Subtract From Minus One Extended with CA & record OV 359
X 011111 10111 01000 1 subfmeo. Subtract From Minus One Extended with CA & record OV & CR 359
X 011111 10001 01000 0 subfo Subtract From & record OV 355
X 011111 10001 01000 1 subfo. Subtract From & record OV & CR 355
X 011111 00110 01000 0 subfze Subtract From Zero Extended with CA 360
X 011111 00110 01000 1 subfze. Subtract From Zero Extended with CA & record CR 360
X 011111 01110 01000 / subfze64 Subtract From Zero Extended with CA64 360
X 011111 11110 01000 / subfze64o Subtract From Zero Extended with CA64 & record OV 360
X 011111 10110 01000 0 subfzeo Subtract From Zero Extended with CA & record OV 360
X 011111 10110 01000 1 subfzeo. Subtract From Zero Extended with CA & record OV & CR 360
X 011111 00010 00100 / td Trap Doubleword 361
D 000010 ----- ----- - tdi Trap Doubleword Immediate 361
X 011111 11000 10010 / tlbivax TLB Invalidate Virtual Address Indexed 362
X 011111 11000 10011 / tlbivaxe TLB Invalidate Virtual Address Indexed Extended 362
X 011111 11101 10010 / tlbre TLB Read Entry 363
X 011111 11100 10010 ? tlbsx TLB Search Indexed 364
X 011111 11100 10011 ? tlbsxe TLB Search Indexed Extended 364
X 011111 10001 10110 / tlbsync TLB Synchronize 365
X 011111 11110 10010 / tlbwe TLB Write Entry 366
X 011111 00000 00100 / tw Trap Word 367
D 000011 ----- ----- - twi Trap Word Immediate 367
X 011111 00100 00011 / wrtee Write External Enable 368
X 011111 00101 00011 / wrteei Write External Enable Immediate 368
X 011111 01001 11100 0 xor XOR 369
X 011111 01001 11100 1 xor. XOR & record CR 369
D 011010 ----- ----- - xori XOR Immediate 369
D 011011 ----- ----- - xoris XOR Immediate Shifted 369
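The Primary (Inst0:5) and Extended (Inst21:31) columns can be read as bit patterns to match against a 32-bit instruction word. The following is a minimal sketch of that reading, not a complete decoder. It assumes bit 0 is the most-significant bit of the word, treats "-" and "?" as don't-care positions, and requires a "/" position to be 0. ROWS is a small hand-copied subset of the table, and the helper names fields, matches, and lookup are hypothetical.

```python
# Sketch: match a 32-bit instruction word against opcode-table rows.
# ROWS is a hand-copied subset of the table; the helper names below are
# illustrative only and are not defined by the architecture.

ROWS = [
    ("100000 ----- ----- -", "lwz"),
    ("011111 00000 10111 /", "lwzx"),
    ("011111 00111 01011 0", "mullw"),
    ("011111 00111 01011 1", "mullw."),
    ("011111 /0010 01011 0", "mulhw"),
]


def fields(word):
    """Return (primary, extended) = (Inst0:5, Inst21:31) of a 32-bit word.

    Assumes bit 0 is the most-significant bit, so Inst0:5 is the top
    six bits and Inst21:31 is the low eleven bits.
    """
    primary = (word >> 26) & 0x3F
    extended = word & 0x7FF
    return primary, extended


def matches(pattern, word):
    """Check one table pattern (6 + 11 characters, spaces ignored)."""
    primary, extended = fields(word)
    bits = format(primary, "06b") + format(extended, "011b")
    for p, b in zip(pattern.replace(" ", ""), bits):
        if p in "-?":
            continue          # don't care / implementation-dependent
        if p == "/":
            if b != "0":
                return False  # reserved bit encoded as 1
        elif p != b:
            return False      # fixed opcode bit mismatch
    return True


def lookup(word):
    """Return the first matching mnemonic in ROWS, or None."""
    for pattern, mnemonic in ROWS:
        if matches(pattern, word):
            return mnemonic
    return None
```

For example, under these assumptions the word 0x7C64282E has primary opcode 011111 and extended field 00000 10111 0, so lookup returns "lwzx". The same word with Inst31 set (0x7C64282F) matches no row in this subset, because the lwzx row marks that bit as reserved.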