04 ARM Assembly
Understanding Computer Architecture
• The first step in understanding any computer architecture is
to learn its language:
– The words in a computer’s language are called instructions.
– The computer’s vocabulary is called the instruction set.
• All programs running on a computer use the same
instruction set.
– All applications are eventually compiled into a series of simple
instructions such as add, subtract, and branch.
Microarchitecture
• A computer architecture does not define the
underlying hardware implementation.
– The specific arrangement of registers, memories, ALUs, and other
building blocks that form a microprocessor is called the
microarchitecture.
[Figure: ARM processor datapath showing the control unit, instruction memory, register file, ALU, extend unit, and data memory]
Microarchitecture
• Different microarchitectures may exist for a single
architecture.
– Intel and Advanced Micro Devices (AMD) both sell various
microprocessors belonging to the same x86 architecture.
– They can all run the same programs,
– but they use different underlying hardware
• and offer different trade-offs in performance, price, and power.
• We will explore microarchitecture in the following weeks!
ARM Architecture
• Developed in the 1980s by Acorn Computers, which later spun off
Advanced RISC Machines
– now called ARM Holdings
• Almost all cell phones and tablets have multiple
ARM processors
– Over 75% of humans use products with an ARM
processor
• Used in servers, cameras, robots, cars, pinball machines, etc.
Machine Language
• Computer hardware understands only 1’s and 0’s
– Instructions are encoded as binary numbers in a format
called machine language.
– The ARM architecture represents each instruction as a
32-bit word.
Assembly Language
• However, humans consider reading machine language
to be tedious
– We prefer to represent the instructions in a symbolic
format called assembly language.
– Each assembly language instruction specifies the
operation to perform and the operands on which to
operate
Instruction: Addition
C Code ARM Assembly Code
a = b + c; ADD a, b, c
Instruction: Subtraction
Similar to addition - only mnemonic changes
C Code ARM assembly code
a = b - c; SUB a, b, c
• SUB: mnemonic
• b, c: source operands
• a: destination operand
Multiple Instructions
More complex code handled by multiple ARM
instructions
C Code ARM assembly code
a = b + c - d; ADD t, b, c ; t = b + c
SUB a, t, d ; a = t - d
Operands
• An instruction operates on operands.
SUB a, b, c
– The variables a, b, and c are all operands.
– But computers operate on 1’s and 0’s, not variable names.
• The instructions need a physical location from which to
retrieve the binary data.
• Operands can be stored in
– Registers
– Memory
– Constants stored in the instruction itself (immediates).
Registers and Memory
[Figure: ARM processor datapath showing the control unit, instruction memory, register file, ALU, extend unit, and data memory]
Operands: Registers
• Instructions need to access operands quickly so that they
can run fast.
– But operands stored in memory take a long time to retrieve.
• Therefore, most architectures specify a small number of
registers that hold commonly used operands.
• ARM has 16 registers
– Registers are faster than memory
– Each register is 32 bits
– ARM is called a “32-bit architecture” because it operates on
32-bit data
ARM Register Set
• Registers:
– R before number, all capitals
– Example: “R0” or “register zero” or “register R0”
Name        Use
R0          Argument / return value / temporary variable
R1-R3       Argument / temporary variables
R4-R11      Saved variables
R12         Temporary variable
R13 (SP)    Stack Pointer
R14 (LR)    Link Register
R15 (PC)    Program Counter
Instructions with Registers
Revisit ADD instruction
Operands: Constants/Immediates
• Many instructions can use constants or
immediate operands
• For example: ADD and SUB
• Value is immediately available from instruction
Generating Constants
Generating small constants using move (MOV):
Operands: Memory
• If registers were the only storage space for operands,
– we would be limited to simple programs with no more than 15 variables.
• However, data can also be stored in memory.
– Whereas the register file is small and fast, memory is larger and
slower.
– For this reason, frequently used variables are kept in registers.
• In the ARM architecture, instructions operate exclusively
on registers
– so data stored in memory must be moved to a register before it
can be processed.
Byte-Addressable Memory
• ARM uses a byte-addressable memory.
• Each data byte has a unique address
– A 32-bit word = 4 bytes, so word addresses increment by 4
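The relationship between word numbers and byte addresses can be sketched in C. This is a minimal illustration rather than ARM code; the helper name `word_to_byte_address` is hypothetical, and words are assumed to be numbered from 0 starting at address 0.

```c
#include <stdint.h>

/* Hypothetical helper: byte address of a 32-bit word, assuming words
   are numbered from 0 and laid out contiguously from address 0. */
uint32_t word_to_byte_address(uint32_t word_number)
{
    /* Each word holds 4 bytes, so consecutive words are 4 addresses apart. */
    return word_number * 4u;
}
```

For example, word 3 starts at byte address 12 (0xC).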
Reading Memory
• A memory read is called a load
– Mnemonic: load register (LDR)
– Format:
LDR R0, [R1, #12]
– Address calculation:
• add base address (R1) to the offset (12)
• address = (R1 + 12)
– Result:
• R0 holds the data at memory address (R1 + 12)
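The address calculation performed by LDR can be modeled in C. This is a sketch under stated assumptions: memory is represented as an array of 32-bit words, accesses are word-aligned, and the name `load_word` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models LDR Rd, [Rn, #offset]: returns the word stored at byte
   address (base + offset). Memory is modeled as an array of words,
   so the byte address is divided by 4 to find the array index. */
uint32_t load_word(const uint32_t *words, size_t nwords,
                   uint32_t base, uint32_t offset)
{
    uint32_t addr = base + offset;      /* address = (R1 + 12) in the example */
    assert(addr % 4u == 0u);            /* assumption: aligned word access */
    assert(addr / 4u < nwords);         /* stay inside the modeled memory */
    return words[addr / 4u];
}
```

With R1 = 16, `LDR R0, [R1, #12]` corresponds to `load_word(words, nwords, 16, 12)`, which reads byte address 28, that is, word 7.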
Writing Memory
• A memory write is called a store
– Mnemonic: store register (STR)
• Example: Store the value held in R7 into memory
word 21.
– Memory address = 4 x 21 = 84 = 0x54
Recap: Accessing Memory
• How to number bytes within a word?
– Little-endian: byte numbers start at the little (least
significant) end
– Big-endian: byte numbers start at the big (most significant)
end
Big-Endian & Little-Endian Example
Suppose R2 and R5 hold the values 8 and 0x23456789, respectively.
• After the following code runs on a big-endian system, what value
is in R7?
• What value is in R7 on a little-endian system?
STR R5, [R2, #0]
LDRB R7, [R2, #1]
              Big-Endian         Little-Endian
Byte Address  8   9   A   B      B   A   9   8
Data Value    23  45  67  89     23  45  67  89
              MSB         LSB    MSB         LSB
Big-endian: R7 = 0x00000045
Little-endian: R7 = 0x00000067
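The two answers can be checked with a small C model of the STR and LDRB instructions. This is a sketch, not how hardware is implemented; the function names and the 16-byte memory array are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Models STR: writes a 32-bit word into 4 consecutive bytes of mem,
   starting at addr, in big- or little-endian byte order. */
void store_word(uint8_t *mem, uint32_t addr, uint32_t value, int big_endian)
{
    for (uint32_t i = 0; i < 4; i++) {
        /* Big-endian puts the most significant byte at the lowest address;
           little-endian puts the least significant byte there. */
        uint32_t shift = big_endian ? 8u * (3u - i) : 8u * i;
        mem[addr + i] = (uint8_t)(value >> shift);
    }
}

/* Models LDRB: loads one byte and zero-extends it to 32 bits. */
uint32_t load_byte(const uint8_t *mem, uint32_t addr)
{
    return mem[addr];
}

/* Runs the example: STR R5, [R2, #0] followed by LDRB R7, [R2, #1]. */
uint32_t endian_example(int big_endian)
{
    uint8_t mem[16] = {0};
    uint32_t r2 = 8u, r5 = 0x23456789u;
    assert(r2 + 4u <= sizeof mem);      /* the word must fit in the model */
    store_word(mem, r2 + 0u, r5, big_endian);
    return load_byte(mem, r2 + 1u);     /* value left in R7 */
}
```

`endian_example(1)` returns 0x45 and `endian_example(0)` returns 0x67, matching the big-endian and little-endian results.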
Programming
• High-level languages
– e.g., C, Java, Python:
– Written at a more abstract level than assembly
• Many high-level languages use common software
constructs
– such as arithmetic and logical operations
– conditional execution, if/else statements
– for and while loops
– array indexing
– function calls.
Data-processing Instructions
• Logical operations
• Shifts / rotate
• Multiplication
Logical Instructions
• These each operate bitwise on two sources and write the
result to a destination register.
– The first source is always a register and the second source is
either an immediate or another register.
• AND
• ORR
• EOR (XOR)
• BIC (Bit Clear)
• MVN (MoVe and NOT)
Logical Instructions: Examples
Logical Instructions: Uses
• AND or BIC: useful for masking bits
Example: Masking all but the least significant byte of a
value
0xF234012F AND 0x000000FF = 0x0000002F
0xF234012F BIC 0xFFFFFF00 = 0x0000002F
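The logical instructions map directly onto C bitwise operators. The sketch below models each one as a function on 32-bit values; the `arm_` function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* AND: keep only the bits set in both sources. */
uint32_t arm_and(uint32_t rn, uint32_t src2) { return rn & src2; }

/* ORR: set the bits that are set in either source. */
uint32_t arm_orr(uint32_t rn, uint32_t src2) { return rn | src2; }

/* EOR: set the bits that differ between the sources. */
uint32_t arm_eor(uint32_t rn, uint32_t src2) { return rn ^ src2; }

/* BIC: clear the bits of rn that are set in src2 (rn AND NOT src2). */
uint32_t arm_bic(uint32_t rn, uint32_t src2) { return (uint32_t)(rn & ~src2); }

/* MVN: bitwise NOT of the single source. */
uint32_t arm_mvn(uint32_t src2) { return (uint32_t)~src2; }
```

Both masking examples above give the same result: `arm_and(0xF234012F, 0x000000FF)` and `arm_bic(0xF234012F, 0xFFFFFF00)` each return 0x0000002F.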
Shift Instructions
• LSL: logical shift left
Example: LSL R0, R7, #5 ; R0 = R7 << 5
Shift Instructions: Example 1
• Immediate shift amount (5-bit immediate)
• Shift amount: 0-31
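A logical shift left with an immediate shift amount can be sketched in C as follows. The function name `lsl` is hypothetical; the assertion reflects the 0-31 range of the 5-bit immediate.

```c
#include <assert.h>
#include <stdint.h>

/* Models LSL Rd, Rm, #shamt5: shifts rm left, filling with zeros.
   Bits shifted past bit 31 are discarded. */
uint32_t lsl(uint32_t rm, unsigned shamt5)
{
    assert(shamt5 <= 31u);      /* a 5-bit immediate encodes 0-31 */
    return rm << shamt5;
}
```

For example, `lsl(1, 5)` returns 32, since shifting left by 5 multiplies by 2^5 when no bits are lost.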
Shift Instructions: Example 2
Multiplication
• MUL: 32 × 32 multiplication, 32-bit result
MUL R1, R2, R3
Result: R1 = (R2 × R3)[31:0]
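The truncation to the least significant 32 bits of the product can be illustrated in C. This is a minimal sketch; the name `mul32` is hypothetical.

```c
#include <stdint.h>

/* Models MUL Rd, Rn, Rm: multiplies two 32-bit values and keeps
   only bits 31:0 of the (up to 64-bit) product. */
uint32_t mul32(uint32_t rn, uint32_t rm)
{
    uint64_t full = (uint64_t)rn * rm;  /* full 64-bit product */
    return (uint32_t)full;              /* discard the upper 32 bits */
}
```

For example, `mul32(0x10000, 0x10000)` returns 0, because the true product 2^32 does not fit in 32 bits.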
ARM Condition Flags
Flag   Name       Description
N      Negative   Instruction result is negative
Z      Zero       Instruction results in zero
C      Carry      Instruction causes an unsigned carry out
V      oVerflow   Instruction causes an overflow
• Set by ALU
• Held in Current Program Status Register (CPSR)
Review: ARM ALU
Review: ALU
[Figure: ARM processor datapath showing the control unit, instruction memory, register file, ALU, extend unit, and data memory]
Setting the Condition Flags: NZCV
• Compare instruction: CMP
Example: CMP R5, R6
– Performs: R5 - R6
– Does not save result
– Sets flags. If result:
• Is 0, Z=1
• Is negative, N=1
• Causes a carry out, C=1
• Causes a signed overflow, V=1
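How CMP derives the four flags can be sketched in C. The `Flags` struct and the name `arm_cmp` are hypothetical. The sketch relies on the ARM convention that a subtraction is performed as an addition of the two's complement, so the carry flag is 1 when no borrow occurs (first operand greater than or equal to the second, as unsigned values).

```c
#include <stdint.h>

/* Hypothetical container for the NZCV condition flags. */
typedef struct { int n, z, c, v; } Flags;

/* Models CMP Rn, Src2: computes rn - src2, discards the result,
   and returns the flags the subtraction would set. */
Flags arm_cmp(uint32_t rn, uint32_t src2)
{
    uint32_t result = rn - src2;
    Flags f;
    f.n = (int)((result >> 31) & 1u);   /* bit 31 set: negative when signed */
    f.z = (result == 0u);               /* result is zero */
    f.c = (rn >= src2);                 /* carry out means no borrow */
    /* Signed overflow: the operands have different signs and the
       sign of the result differs from the sign of rn. */
    f.v = (int)((((rn ^ src2) & (rn ^ result)) >> 31) & 1u);
    return f;
}
```

For example, `arm_cmp(5, 5)` sets Z and C, while `arm_cmp(4, 8)` sets only N.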
Condition Mnemonics
• Instruction may be conditionally executed based on
the condition flags
• Condition of execution is encoded as a condition
mnemonic appended to the instruction mnemonic
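The way a condition mnemonic is evaluated against the flags can be sketched in C. The mnemonics covered here (EQ, NE, HS, LO, GE, LT, GT, LE, AL) are a subset of the standard ARM condition codes; the `Flags` and `Cond` types and the function name `condition_passes` are hypothetical.

```c
/* Hypothetical container for the NZCV condition flags. */
typedef struct { int n, z, c, v; } Flags;

/* A subset of the ARM condition mnemonics. */
typedef enum {
    COND_EQ, COND_NE, COND_HS, COND_LO,
    COND_GE, COND_LT, COND_GT, COND_LE, COND_AL
} Cond;

/* Returns 1 if an instruction with this condition would execute. */
int condition_passes(Cond cond, Flags f)
{
    switch (cond) {
    case COND_EQ: return f.z;                       /* equal */
    case COND_NE: return !f.z;                      /* not equal */
    case COND_HS: return f.c;                       /* unsigned higher or same */
    case COND_LO: return !f.c;                      /* unsigned lower */
    case COND_GE: return f.n == f.v;                /* signed >= */
    case COND_LT: return f.n != f.v;                /* signed < */
    case COND_GT: return !f.z && (f.n == f.v);      /* signed > */
    case COND_LE: return f.z || (f.n != f.v);       /* signed <= */
    case COND_AL: return 1;                         /* always */
    }
    return 0;
}
```

After a compare of equal values (Z=1), an instruction with the EQ suffix executes and one with the NE suffix is skipped.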
Condition Mnemonics
Conditional Execution
Example:
CMP R5, R9 ; performs R5-R9
; sets condition flags
Branching
• Branches enable out-of-sequence instruction execution
– ARM uses branch instructions to skip over sections of code or
repeat code.
• Types of branches:
– Branch (B)
• branches to another instruction
– Branch and link (BL)
• branches and saves the return address in the link register (LR)
• Both can be conditional or unconditional
The Stored Program
• A program usually executes in sequence, with the program
counter (PC) incrementing by 4 after each instruction to
point to the next instruction.
– Recall that instructions are 4 bytes long and ARM is a
byte-addressed architecture.
• Branch instructions change the program counter.
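The choice of the next program counter value can be sketched in C. This is a simplified model that ignores how the branch target is encoded in the instruction; the function name `next_pc` and its parameters are hypothetical.

```c
#include <stdint.h>

/* Returns the address of the next instruction to execute:
   the branch target if a branch is taken, otherwise the
   instruction that follows in memory (4 bytes later). */
uint32_t next_pc(uint32_t pc, int branch_taken, uint32_t target)
{
    return branch_taken ? target : pc + 4u;
}
```

For example, from address 0x200 sequential execution continues at 0x204, while a taken branch to 0x300 continues at 0x300.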
Review: Stored Program
[Figure: ARM processor datapath showing the control unit, instruction memory, register file, ALU, extend unit, and data memory]
Unconditional Branching (B)
ARM assembly
MOV R2, #17 ; R2 = 17
B TARGET ; branch to target
ORR R1, R1, #0x4 ; not executed
TARGET
SUB R1, R1, #78 ; R1 = R1 - 78
Conditional Branching
ARM Assembly
MOV R0, #4 ; R0 = 4
ADD R1, R0, R0 ; R1 = R0 + R0 = 8
CMP R0, R1 ; sets flags with R0 - R1
BEQ THERE ; branch not taken (Z = 0)
ORR R1, R1, #1 ; R1 = R1 OR 1 = 9
THERE
ADD R1, R1, #78 ; R1 = R1 + 78 = 87
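The assembly above can be traced with an equivalent C function. This is an illustrative translation, with C variables standing in for the registers; the function name is hypothetical.

```c
#include <stdint.h>

/* C equivalent of the conditional branching example.
   Returns the final value of R1. */
uint32_t conditional_branch_example(void)
{
    uint32_t r0, r1;
    r0 = 4u;                /* MOV R0, #4 */
    r1 = r0 + r0;           /* ADD R1, R0, R0 */
    if (r0 != r1) {         /* CMP R0, R1 then BEQ THERE: skip if equal */
        r1 = r1 | 1u;       /* ORR R1, R1, #1 */
    }
    r1 = r1 + 78u;          /* THERE: ADD R1, R1, #78 */
    return r1;
}
```

Because R0 (4) and R1 (8) are not equal, the branch is not taken, the ORR executes, and the function returns 87.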
if Statement
L1
f = f - i; SUB R0, R0, R2 ; f = f - i
if/else Statement
C Code ARM Assembly Code
;R0=f, R1=g, R2=h, R3=i, R4=j
while Loops
For Loops
Arrays
• Access large amounts of similar data
– Index: access to each element
– Size: number of elements
• 5-element array
– Base address = 0x14000000
(address of first element,
scores[0])
– Array elements accessed relative
to base address
Accessing Arrays
C Code
int array[5];
array[0] = array[0] * 8;
array[1] = array[1] * 8;
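The address arithmetic behind array indexing can be sketched in C. The sketch assumes 4-byte elements; the helper names are hypothetical, and using a left shift by 3 for the multiplication by 8 is one common approach, not necessarily the one used in the original assembly.

```c
#include <stdint.h>

/* Byte address of element `index` in an array of 4-byte integers
   that starts at `base`. */
uint32_t element_address(uint32_t base, uint32_t index)
{
    return base + 4u * index;   /* each element occupies 4 bytes */
}

/* Multiplying by 8 is the same as shifting left by 3 bits
   (for values whose result fits in 32 bits). */
uint32_t times8(uint32_t value)
{
    return value << 3;
}
```

With a base address of 0x14000000, element 1 is at 0x14000004 and element 4 is at 0x14000010.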
Accessing Arrays Using a for Loop
Function Calls
Caller:
– passes arguments to callee
– jumps to callee
Callee:
– performs the function
– returns result to caller
– returns to point of call
– must not overwrite registers or
memory needed by caller
C Code
void main()
{
  int y;
  y = sum(42, 7);
  ...
}

int sum(int a, int b)
{
  return (a + b);
}
ARM Function Conventions
• Call function: branch and link
BL
– BL stores the return address (the address of the next instruction) in
the link register (LR) and branches to the target instruction.
• Return from function: move the link register to PC:
MOV PC, LR
• Arguments: R0-R3
• Return value: R0
Function Calls
C Code              ARM Assembly Code
int main() {
  simple();         0x00000200 MAIN  BL SIMPLE
  a = b + c;        0x00000204       ADD R4, R5, R6
}                   ...
Input Arguments and Return Value
C Code
int main()
{
int y;
...
y = diffofsums(2, 3, 4, 5); // 4 arguments
...
}
Further Reading