Lecture 04 Assembly II
Procedures
◼ int f1 (int i, int j, int k, int g)
{ :::: callee
return 1;
}
[Figure: the procedure frame on the stack, delimited by $fp and $sp, shown before the procedure call, during the procedure call, and after the procedure call. Labels in the figure: argument registers, return registers, callee-saved registers, local variables, "Stack grows".]
Procedure Calling Convention
◼ Calling Procedure
❑ Step-1: pass the argument
❑ Step-2: save caller-saved registers
❑ Step-3: Execute a jal instruction
C code:
    foo1 ()
    {
        ::::::::
        i = i + 1;
        x = foo(4);
        i = x + i;
    }
MIPS code for the call:
    ::::::::
    li   $a0, 4         # pass the argument
    sw   $t3, 4($sp)    # save $t3
    jal  foo
    lw   $t3, 4($sp)    # restore $t3
    add  $t3, $v0, $t3  # i = x + i
    ::::::
Procedure Calling Convention
◼ Called Procedure
❑ Step-1: establish stack frame
◼ subi $sp, $sp, <frame-size>
❑ Step-2: save callee-saved registers
◼ $ra, $fp, $s0-$s7
❑ Step-3: establish frame pointer
◼ addi $fp, $sp, <frame-size>-4
◼ On return from a call
❑ Step-1: put returned values in registers $v0, $v1
❑ Step-2: restore callee-saved registers
❑ Step-3: pop the stack
❑ Step-4: return: jr $ra
Example (frame size 32):
    subi $sp, $sp, 32     # establish stack frame
    sw   $ra, 20($sp)     # save $ra
    sw   $fp, 16($sp)     # save $fp
    addi $fp, $sp, 28     # establish frame pointer
    :::::
    addi $v0, $zero, 1    # put the return value in $v0
    lw   $fp, 16($sp)     # restore $fp
    lw   $ra, 20($sp)     # restore $ra
    addi $sp, $sp, 32     # pop the stack
    jr   $ra              # return
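The callee steps above can be traced with a small C simulation. This is only a sketch under stated assumptions: the `machine` struct, the 256-byte stack array, and the function names are hypothetical stand-ins, and the offsets (20, 16, 28) are the ones from the frame-size-32 example on this slide.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 256  /* hypothetical size of the simulated stack */

/* Hypothetical machine state: only the registers the callee touches. */
struct machine {
    uint32_t sp, fp, ra;
    uint32_t stack[STACK_BYTES / 4];
};

static void store_word(struct machine *m, uint32_t addr, uint32_t value)
{
    assert(addr % 4 == 0 && addr < STACK_BYTES);  /* aligned, in range */
    m->stack[addr / 4] = value;
}

static uint32_t load_word(const struct machine *m, uint32_t addr)
{
    assert(addr % 4 == 0 && addr < STACK_BYTES);
    return m->stack[addr / 4];
}

/* Steps 1-3 of the called procedure. */
void callee_prologue(struct machine *m)
{
    m->sp -= 32;                       /* subi $sp, $sp, 32 */
    store_word(m, m->sp + 20, m->ra);  /* sw   $ra, 20($sp) */
    store_word(m, m->sp + 16, m->fp);  /* sw   $fp, 16($sp) */
    m->fp = m->sp + 28;                /* addi $fp, $sp, 28 */
}

/* Steps 2-3 on return: restore saved registers, then pop the frame. */
void callee_epilogue(struct machine *m)
{
    m->fp = load_word(m, m->sp + 16);  /* lw   $fp, 16($sp) */
    m->ra = load_word(m, m->sp + 20);  /* lw   $ra, 20($sp) */
    m->sp += 32;                       /* addi $sp, $sp, 32 */
}
```

Calling `callee_prologue` and then `callee_epilogue` should leave `sp`, `fp`, and `ra` exactly as the caller had them, even if the body in between overwrites `ra` with a nested call.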
Preserved Registers
    No.      Name     Usage
    0        zero     constant 0
    1        at       reserved for assembler
    2-3      v0-v1    expression evaluation & function results
    4-7      a0-a3    arguments
    8-15     t0-t7    temporary: caller saves
    16-23    s0-s7    callee saves
    24-25    t8-t9    temporary (cont'd)
    26-27    k0-k1    reserved for OS kernel
    28       gp       pointer to global area
    29       sp       stack pointer
    30       fp       frame pointer
    31       ra       return address (HW)
Nested Procedures
C code:
    int fact (int n)
    {
        if (n < 1) return 1;
        else return (n * fact(n - 1));
    }
MIPS code:
fact:
    addi $sp, $sp, -8     # adjust stack for 2 items
    sw   $ra, 4($sp)      # save $ra
    slti $t0, $a0, 1      # n < 1?
    beq  $t0, $zero, L1   # if n >= 1, go to L1
    addi $v0, $zero, 1    # return 1
    addi $sp, $sp, 8      # fix up the stack pointer & return
    jr   $ra
L1: sw   $a0, 0($sp)      # save argument $a0
    addi $a0, $a0, -1     # n = n - 1
    jal  fact             # call fact(n - 1)
    lw   $a0, 0($sp)      # restore argument $a0
    mul  $v0, $a0, $v0    # return n * fact(n - 1)
    lw   $ra, 4($sp)      # restore $ra
    addi $sp, $sp, 8      # restore stack pointer
    jr   $ra              # return to the caller
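To make the role of the saved `$a0` slots visible, here is a sketch in C that replaces the recursion with an explicit array standing in for the run-time stack. The function name `fact_explicit_stack` and the depth limit are assumptions for illustration; the limit of 12 also keeps the result within a 32-bit `int`.

```c
#include <assert.h>

#define MAX_N 12  /* 12! is the largest factorial that fits in a 32-bit int */

/* Hypothetical helper mirroring the MIPS fact routine with an explicit stack. */
int fact_explicit_stack(int n)
{
    int saved_args[MAX_N + 1];  /* stands in for the 0($sp) slot of each frame */
    int depth = 0;              /* number of frames currently pushed */
    int result;

    assert(n >= 0 && n <= MAX_N);

    /* "Call" phase: each nested call saves its n, then recurses on n - 1. */
    while (n >= 1) {
        saved_args[depth++] = n;  /* sw   $a0, 0($sp)   */
        n = n - 1;                /* addi $a0, $a0, -1  */
    }

    result = 1;                   /* base case: n < 1 returns 1 */

    /* "Return" phase: each frame restores its n and multiplies. */
    while (depth > 0) {
        n = saved_args[--depth];  /* lw   $a0, 0($sp)   */
        result = n * result;      /* mul  $v0, $a0, $v0 */
    }
    return result;
}
```

The two loops correspond to the code before and after `jal fact`: arguments are saved on the way down and restored in reverse order on the way back up.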
Allocating Space for New Data on the Heap
◼ Heap
❑ The segment for dynamic data structures, e.g. linked lists
Memory layout (highest address first):
    $sp → 7fff fffc hex    Stack
                           Dynamic data
    $gp → 1000 8000 hex    Static data
          1000 0000 hex
    pc  → 0040 0000 hex    Text
          0                Reserved
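As an illustration of a dynamic data structure living on the heap, the following C sketch builds a linked list with `malloc`. The type and function names (`node`, `push_front`, `list_sum`, `list_free`) are hypothetical, and the `assert` on allocation failure is a shortcut acceptable only in a sketch.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Allocate a node on the heap and link it in front of head. */
struct node *push_front(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);  /* sketch only: abort if the heap is exhausted */
    n->value = value;
    n->next = head;
    return n;
}

/* Walk the list and add up the stored values. */
int list_sum(const struct node *head)
{
    int sum = 0;
    for (; head != NULL; head = head->next)
        sum += head->value;
    return sum;
}

/* Return every node to the heap. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

Each node's lifetime is controlled by the program rather than by a procedure frame, which is why it belongs in the dynamic data segment and not on the stack.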
Representation of Characters
◼ ASCII (American Standard Code for Information
Interchange)
❑ Uses 8 bits to represent a character
❑ MIPS provides instructions to move bytes:
lb $t0, 0($sp) #Read byte from source
sb $t0, 0($gp) #Write byte to destination
◼ Unicode (Universal Encoding)
❑ Uses 16 bits to represent a character
❑ Used in Java
❑ MIPS provides instructions to move 16 bits:
lh $t0, 0($sp) #Read halfword from source
sh $t0, 0($gp) #Write halfword to destination
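The effect of the byte and halfword loads on a 32-bit register can be sketched in C. The function names are hypothetical. Two assumptions go beyond the slide: `lb` and `lh` sign-extend the loaded value (the unsigned variant `lbu` zero-extends), and the halfword is assembled in little-endian byte order, which depends on how the MIPS system is configured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* lb: load one byte and sign-extend it to 32 bits. */
int32_t load_byte_signed(const uint8_t *mem, size_t addr)
{
    int32_t v = mem[addr];
    if (v & 0x80)
        v -= 0x100;  /* propagate the sign bit */
    return v;
}

/* lbu: load one byte and zero-extend it to 32 bits. */
uint32_t load_byte_unsigned(const uint8_t *mem, size_t addr)
{
    return mem[addr];
}

/* lh: load a 16-bit halfword (little-endian assumed) and sign-extend it. */
int32_t load_half_signed(const uint8_t *mem, size_t addr)
{
    int32_t v;
    assert(addr % 2 == 0);  /* halfword accesses must be aligned */
    v = (int32_t)mem[addr] | ((int32_t)mem[addr + 1] << 8);
    if (v & 0x8000)
        v -= 0x10000;       /* propagate the sign bit */
    return v;
}
```

For plain ASCII text the distinction does not matter, since every code is below 128, but it does matter when bytes are treated as small signed numbers.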
String Copy Procedure in C
void strcpy (char x[ ], char y[ ]) {
    int i;
    i = 0;
    while ((x[i] = y[i]) != '\0')
        i += 1;
}
strcpy:
addi $sp, $sp, -4 # adjust stack for 1 more item
sw $s0, 0($sp) # save $s0
add $s0, $zero, $zero #i=0
L1: add $t1, $s0, $a1 # address of y[i] in $t1
lb $t2, 0($t1) # t2 = y[i]
add $t3, $s0, $a0 # address of x[i] in $t3
sb $t2, 0($t3) # x[i] = y[i]
beq $t2, $zero, L2 # if y[i]==0, go to L2
addi $s0, $s0, 1 # i= i+1
j L1 # go to L1
L2: lw $s0, 0($sp) # y[i] == 0: end of string, restore old $s0
addi $sp, $sp, 4 # pop 1 word off stack
jr $ra # return
MIPS Addressing Mode
◼ Addressing mode:
❑ How to indicate where the data is
◼ 5 addressing modes in MIPS
❑ Immediate addressing
❑ Register addressing
❑ Base or displacement addressing
❑ PC-relative addressing
❑ Pseudo-direct addressing
MIPS Addressing Mode (1)
◼ Immediate addressing
op rs rt Immediate
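The field layout above can be made concrete with a C sketch that packs and unpacks an I-format word. The struct and function names are hypothetical; the field widths (6, 5, 5, 16 bits) are the standard MIPS I-format, and the immediate is treated as a signed 16-bit value, as it is for `addi`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the four I-format fields. */
struct itype {
    uint32_t op, rs, rt;
    int32_t imm;
};

/* Pack op (6 bits), rs (5), rt (5), and a signed 16-bit immediate. */
uint32_t encode_itype(struct itype f)
{
    assert(f.op < 64 && f.rs < 32 && f.rt < 32);
    assert(f.imm >= -32768 && f.imm <= 32767);
    return (f.op << 26) | (f.rs << 21) | (f.rt << 16)
         | ((uint32_t)f.imm & 0xFFFFu);
}

/* Unpack the fields and sign-extend the immediate. */
struct itype decode_itype(uint32_t word)
{
    struct itype f;
    f.op = word >> 26;
    f.rs = (word >> 21) & 0x1Fu;
    f.rt = (word >> 16) & 0x1Fu;
    f.imm = (int32_t)(word & 0xFFFFu);
    if (f.imm & 0x8000)
        f.imm -= 0x10000;
    return f;
}
```

For example, `addi $sp, $sp, -8` from the fact routine has op 8, rs 29, rt 29, and immediate -8, so the operand is carried inside the instruction word itself.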
MIPS Addressing Mode (2)
◼ Register addressing
op rs rt rd … funct
Register
◼ Base addressing
op rs rt Address
Register
Method 1.
How to Get the Base Address in the Base Register (cont.)
Method 2.
la $6,xyz #r6 contains the address of xyz
lw $5,0($6) #r5 contains the contents of xyz
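A 32-bit address does not fit in one 16-bit immediate, so assemblers typically expand `la` into a pair of instructions that load the upper and lower halves (commonly `lui` followed by `ori`). The C sketch below shows that split and the base-plus-offset calculation used by `lw`. The function names and the sample address 0x10008000 used in the usage note are assumptions, not values from the slide.

```c
#include <assert.h>
#include <stdint.h>

struct halves {
    uint16_t upper, lower;
};

/* Split a 32-bit address into the two 16-bit immediates. */
struct halves split_address(uint32_t address)
{
    struct halves h;
    h.upper = (uint16_t)(address >> 16);
    h.lower = (uint16_t)(address & 0xFFFFu);
    return h;
}

/* lui places the upper half; ori fills in the lower half. */
uint32_t build_address(struct halves h)
{
    return ((uint32_t)h.upper << 16) | h.lower;
}

/* Base addressing: base register plus a signed 16-bit offset. */
uint32_t effective_address(uint32_t base, int32_t offset)
{
    assert(offset >= -32768 && offset <= 32767);
    return base + (uint32_t)offset;
}
```

If `xyz` lived at 0x10008000, the halves would be 0x1000 and 0x8000, and `lw $5,0($6)` would read from that same address because the offset is 0.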
MIPS Addressing Mode (4)
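◼ PC-relative addressing
op rs rt Address
[Figure: the Address field is added to the PC to select a word in memory]
◼ Pseudodirect addressing
op Address
[Figure: the Address field is concatenated with the PC to select a word in memory]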
Example : j 100
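Both target calculations can be sketched in C. The function names are hypothetical. The details assumed here are the standard MIPS ones: the field counts words, so it is shifted left by 2, the branch offset is relative to PC + 4, and a jump keeps the upper 4 bits of PC + 4.

```c
#include <assert.h>
#include <stdint.h>

/* PC-relative: target = (PC + 4) + sign_extend(imm16) * 4 */
uint32_t branch_target(uint32_t pc, uint16_t imm16)
{
    int32_t offset = imm16;
    assert(pc % 4 == 0);
    if (offset & 0x8000)
        offset -= 0x10000;  /* sign-extend the 16-bit field */
    return pc + 4 + (uint32_t)(offset * 4);
}

/* Pseudodirect: upper 4 bits of PC + 4, then the 26-bit field, then 00 */
uint32_t jump_target(uint32_t pc, uint32_t addr26)
{
    assert(pc % 4 == 0);
    assert(addr26 < (1u << 26));
    return ((pc + 4) & 0xF0000000u) | (addr26 << 2);
}
```

Under these assumptions, a jump whose address field holds 100 and whose PC has zero upper bits lands at byte address 400.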
Parallelism and Instructions: Synchronization
[Figure: translation flow: Assembler → Linker → Loader → Memory]
Assembler
◼ Assembler
❑ The assembler turns the assembly language program
into an object file.
❑ Symbol table: A table that matches the names of labels
to the addresses of the memory words that the
instructions occupy.
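A symbol table can be as simple as an array of label and address pairs. The C sketch below is a minimal illustration, not how any particular assembler stores it; the names `symtab`, `symtab_add`, and `symtab_lookup` and the fixed capacity are assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_SYMBOLS 16  /* hypothetical fixed capacity */

struct symbol {
    const char *label;
    uint32_t address;
};

struct symtab {
    struct symbol entries[MAX_SYMBOLS];
    int count;
};

/* Record that a label names the word at the given address. */
void symtab_add(struct symtab *t, const char *label, uint32_t address)
{
    assert(t->count < MAX_SYMBOLS);
    t->entries[t->count].label = label;
    t->entries[t->count].address = address;
    t->count++;
}

/* Return 1 and store the address if the label is known, else 0. */
int symtab_lookup(const struct symtab *t, const char *label, uint32_t *address)
{
    int i;
    for (i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].label, label) == 0) {
            *address = t->entries[i].address;
            return 1;
        }
    }
    return 0;  /* unresolved: defined in another module, left for the linker */
}
```

A label that is looked up but not found corresponds to an external reference that the assembler cannot resolve on its own.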
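Object file for procedure A
    Object file header
        Name         Procedure A
        Text size    100hex
        Data size    20hex
    Text segment
        Address    Instruction         Source
        0          lw $a0, 0($gp)      lw $a0, x
        4          jal 0               jal B
        …          …
    Relocation information
        Address    Instruction type    Dependency
        0          lw                  X
        4          jal                 B
    Symbol table
        Label    Address
        X        —
        B        —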
Assembler (cont.)
Linker (Link editor)
◼ Linker takes all the independently assembled
machine language programs and “stitches”
them together to produce an executable file
that can be run on a computer.
◼ There are three steps for the linker:
1. Place code and data modules symbolically in
memory.
2. Determine the addresses of data and instruction
labels.
3. Patch both the internal and external references.
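Steps 1 and 3 can be sketched in C for the `jal` case. The function names are hypothetical. The sketch assumes modules are laid out back to back starting at the text base 0040 0000 hex, and that a `jal` word has opcode 3 with the target's word address in its low 26 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Step 1: place the next module right after one of the given text size. */
uint32_t place_after(uint32_t base, uint32_t text_size)
{
    assert(base % 4 == 0 && text_size % 4 == 0);
    return base + text_size;
}

/* Step 3: rewrite the 26-bit address field of a jal instruction. */
uint32_t patch_jal(uint32_t instr, uint32_t target)
{
    assert((instr >> 26) == 3);  /* opcode of jal */
    assert(target % 4 == 0);     /* instructions are word aligned */
    return (instr & 0xFC000000u) | ((target >> 2) & 0x03FFFFFFu);
}
```

With procedure A at 0040 0000 hex and a text size of 100 hex, procedure B would start at 0040 0100 hex, and the placeholder `jal 0` in A would be patched to point there.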
Object file header
    Name    Procedure A
Text segment
    Address    Instruction
    0          lw $a0, 0($gp)
    4          jal 0
    …          …
Data segment
    0          (X)
    …          …
Executable file header
    Text size    300hex
    Data size    50hex
Relocation information
    Address    Instruction type    Dependency
    0          sw                  Y
    4          jal                 A
Source instructions shown on the slide:
    lw $a0, y
    jal A
Symbol table
    Label    Address
    Y        —
    A        —
Loader
◼ Reads the executable file header to determine the size of the text
and data segments
◼ Creates an address space large enough for the text and data
◼ Copies the instructions and data from the executable file into
memory
◼ Copies the parameters (if any) to the main program onto the
stack
◼ Initializes the machine registers and sets the stack pointer to the
first free location
◼ Jumps to a start-up routine
main();
_start_up:
lw $a0, offset($sp) ## load arguments
jal main
exit
Dynamically Linked Libraries (DLL)
◼ Disadvantages of traditional statically linked
libraries
❑ Library updates require relinking the program
❑ The whole library is loaded even if only part of it
is used
◼ Dynamically linked library
❑ The libraries are not linked and loaded until the
program is run.
❑ Lazy procedure linkage
◼ Each routine is linked only after it is called.
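The idea of linking a routine only on its first call can be imitated in C with a function pointer that initially points at a stub. This is a loose analogy rather than how a real dynamic linker is implemented; all names (`square_entry`, `square_stub`, `real_square`, `resolve_count`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

int resolve_count = 0;  /* how many times the "dynamic linker" ran */

/* Stands in for the routine that lives in the shared library. */
int real_square(int x)
{
    return x * x;
}

int square_stub(int x);

/* Indirection slot: every call goes through this pointer. */
int (*square_entry)(int) = square_stub;

/* First call lands here: resolve the routine, patch the slot, forward. */
int square_stub(int x)
{
    resolve_count++;
    square_entry = real_square;
    return square_entry(x);
}

/* What the program's call site does. */
int call_square(int x)
{
    assert(square_entry != NULL);
    return square_entry(x);
}
```

Only the first call pays the cost of resolution; later calls go straight through the patched slot, and a routine that is never called is never linked.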
Dynamically linked library via lazy procedure linkage
Starting a Java Program
IA-32: Alternative Architectures
◼ 1978: The Intel 8086 is announced (16 bit architecture)
◼ 1980: The 8087 floating point coprocessor is added
◼ 1982: The 80286 increases address space to 24 bits, +instructions
◼ 1985: The 80386 extends to 32 bits, new addressing modes
◼ 1989-1995: The 80486, Pentium, Pentium Pro add only 4 new instructions
(mostly designed for higher performance)
◼ 1997: 57 new “MMX” instructions are added
◼ 1999: The Pentium III added another 70 instructions (SSE)
◼ 2001: Another 144 instructions (SSE2)
◼ 2003: AMD extends the architecture to increase address space to 64 bits,
widens all registers to 64 bits and other changes (AMD64)
◼ 2004: Intel adopts AMD64 as Extended Memory 64 Technology (EM64T) and adds
more media extensions (SSE3)
EAX GPR 0
ECX GPR 1
EDX GPR 2
EBX GPR 3
ESP GPR 4
EBP GPR 5
ESI GPR 6
EDI GPR 7
IA-32 instruction Formats
a. JE EIP + displacement
    Field    JE    Condition    Displacement
    Bits     4     4            8
b. CALL
    Field    CALL    Offset
    Bits     8       32
d. PUSH ESI
    Field    PUSH    Reg
    Bits     5       3
Reduced Instruction Set Architecture (RISC)
◼ Instruction set simplicity leads to a faster machine
❑ efficient pipelining
◼ 32-bit fixed format instruction (3 formats)
◼ 3-operand, reg-reg arithmetic instruction
◼ Supporting very few addressing modes for load/store
❑ displacement
❑ immediate
Popular ISAs
Why is ARM architecture so popular?
◼ Chip vendors build their own chipsets without
designing a CPU from scratch
❑ Qualcomm, Broadcom, MediaTek etc.
❑ Qualcomm & Broadcom are good at IC design for
communications
◼ Challenges of using ARM
❑ Licensing is very expensive
Snapdragon 845
Alternative to ARM – RISC-V
◼ RISC-V
❑ Also based on MIPS
❑ An open source ISA (Instruction Set Architecture)
◼ Vendors do not have to pay for the ISA
❑ Defines its own instruction set
◼ Very similar to MIPS
❑ You can design your own CPU based on RISC-V
❑ https://riscv.org/
❑ Originally an academic project, later the basis for start-up companies
◼ https://www.sifive.com/about
Next Week
◼ Datapath
❑ The HW built for executing the MIPS ISA
Array vs. Pointer
void Clear1(int array[ ], int size)
{
    int i;
    for (i = 0; i < size; i += 1)
        array[i] = 0;
}
Pointer
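The pointer version did not survive on this slide, so the following is a sketch of what a pointer-based counterpart to the array version typically looks like. The name `clear2` is an assumption. Instead of recomputing the address of `array[i]` from an index on each iteration, it advances a pointer through the elements.

```c
#include <assert.h>
#include <stddef.h>

/* Zero the first size elements by walking a pointer through the array. */
void clear2(int *array, int size)
{
    int *p;
    assert(array != NULL && size >= 0);
    for (p = &array[0]; p < &array[size]; p = p + 1)
        *p = 0;
}
```

Both versions zero the same elements; the difference lies in the address arithmetic the compiler has to generate inside the loop.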