Lec02 03 Updated
Lecture – 2-3
Instruction Set Architecture
Amdahl's Law
• Suppose that enhancement E accelerates a fraction F of the task by a
factor S, and the remainder of the task is unaffected
[Figure: execution time of the original task compared with the execution time of the task after fraction F is enhanced by factor S]
Amdahl's Law
$$\text{Speedup}(E) = \frac{\text{Ex Time without } E}{\text{Ex Time with } E} = \frac{\text{Performance with } E}{\text{Performance without } E}$$
Amdahl’s Law
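$$\text{Speedup}_{overall} = \frac{\text{ExTime}_{old}}{\text{ExTime}_{new}} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \dfrac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}}$$
With F = Fraction_enhanced and S = Speedup_enhanced:
$$\text{Speedup}_{overall} = \frac{1}{(1 - F) + \dfrac{F}{S}}$$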
Amdahl’s Law
• Overall speedup if we make 90% of a program run 10 times faster.
F = 0.9, S = 10
$$\text{Speedup}_{overall} = \frac{1}{(1 - 0.9) + 0.9/10} = \frac{1}{0.1 + 0.09} = 5.26$$
• With F = 0.8, S = 1.2:
$$\text{Speedup}_{overall} = \frac{1}{(1 - 0.8) + 0.8/1.2} = \frac{1}{0.2 + 0.667} = 1.154$$
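The two worked examples can be checked with a few lines of Python. This is only a minimal sketch of the formula Speedup = 1 / ((1 - F) + F/S); the function name `amdahl_speedup` and its parameter names are chosen here for illustration and do not come from the slides.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when a fraction F of the task runs S times faster."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    # The unaffected part (1 - F) keeps its time; the enhanced part shrinks to F / S.
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


if __name__ == "__main__":
    print(round(amdahl_speedup(0.9, 10), 2))   # F = 0.9, S = 10
    print(round(amdahl_speedup(0.8, 1.2), 3))  # F = 0.8, S = 1.2
```

Even with 90% of the program accelerated tenfold, the untouched 10% keeps the overall gain close to 5x.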
Amdahl’s Law
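• The law is often used to predict the potential performance improvement of a system when adding more processors or improving the speed of individual processors.
$$\text{Speedup}_{overall} = \frac{1}{(1 - P) + \dfrac{P}{N}}$$
where P is the fraction of the program that can be parallelized and N is the number of processors.
With P = 0.2 and N = 5:
$$\text{Speedup}_{overall} = \frac{1}{(1 - 0.2) + 0.2/5} = \frac{1}{0.8 + 0.04} = 1.19$$
This means that the overall performance of the system would improve by about 19% with the addition of the 4 processors.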
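A short worked limit shows why the gain is so small. Here N is a symbol assumed for the number of processors and P = 0.2 is the parallelizable fraction used in the example; as N grows, the parallel term vanishes and only the serial part remains:

```latex
\lim_{N \to \infty} \frac{1}{(1 - P) + \dfrac{P}{N}} = \frac{1}{1 - P},
\qquad
P = 0.2 \;\Rightarrow\; \frac{1}{1 - 0.2} = \frac{1}{0.8} = 1.25
```

So with only 20% of the work parallelizable, no number of processors can deliver more than a 25% improvement.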
A Translation Hierarchy
• High Level Language (HLL) programs are first compiled (possibly into assembly), then linked and finally loaded into main memory.
[Figure: C program → Compiler → Assembly language program → Assembler → Linker → Loader → Memory]
MIPS R3000 Instruction Set Architecture (Summary)
° Machine Environment Target
° Instruction Categories
Load/Store
Computational
Jump and Branch
Floating Point (coprocessor)
Registers: R0 - R31, PC, HI, LO
3 Instruction Formats: all 32 bits wide
R: OP Rs Rt Rd sa funct
I: OP Rs Rt Immediate
J: OP jump target
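To make the three 32-bit formats concrete, the sketch below packs fields into instruction words. The field widths (6/5/5/5/5/6 bits for R, a 16-bit immediate for I, a 26-bit target for J) and the opcode and funct values in the examples follow the standard MIPS32 encoding; they are assumptions brought in here, not values given on the slide, and the function names are illustrative.

```python
def encode_r(op, rs, rt, rd, sa, funct):
    """R format: OP(6) Rs(5) Rt(5) Rd(5) sa(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sa << 6) | funct


def encode_i(op, rs, rt, immediate):
    """I format: OP(6) Rs(5) Rt(5) Immediate(16)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def encode_j(op, target):
    """J format: OP(6) jump target(26)."""
    return (op << 26) | (target & 0x3FFFFFF)


# add $8, $17, $18  (op = 0, funct = 0x20 in the standard encoding)
add_word = encode_r(0, 17, 18, 8, 0, 0x20)
# lw $8, 12($16)    (op = 0x23 in the standard encoding)
lw_word = encode_i(0x23, 16, 8, 12)
# j with a 26-bit word target (op = 2 in the standard encoding)
j_word = encode_j(2, 0x0100000)

print(f"{add_word:08x} {lw_word:08x} {j_word:08x}")
```

Because every format is exactly 32 bits with OP in the same position, the decoder can inspect the top six bits before it knows anything else about the instruction.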
Review C Operators/Operands
• Operators: +, -, *, /, % (mod); (7/4==1, 7%4==3)
• Operands:
– Variables: fahr, celsius
– Constants: 0, 1000, -17, 15.4
Assembly Operators
° Syntax of Assembly Operator
1) operation by name (“Mnemonics”)
2) operand getting result (Register or Memory)
3) 1st operand for operation
4) 2nd operand for operation
Assembly Operators/Instructions
° MIPS Assembly Syntax is rigid:
1 operation, 3 variables
Why? Keep Hardware simple via regularity
Compilation
° How to turn the notation that programmers prefer into notation the computer understands?
° Easy:
add a, b, c # a = b + c
sub d, a, e # d = a - e
Compilation 2
° Example: compile by hand this C code:
f = (g + h) - (i + j);
Compilation -- Summary
° C statement (5 operands, 3 operators):
f = (g + h) - (i + j);
° Becomes 3 assembly instructions (6 unique operands, 3 operators):
add f,g,h # f contains g+h
add t1,i,j # t1 contains i+j
sub f,f,t1 # f=(g+h)-(i+j)
° In general, each line of C produces many assembly instructions
One reason why people program in C vs. Assembly: fewer lines of code
Other reasons? (many!)
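The hand compilation can be checked by executing the three instructions over named variables. The tiny interpreter below is an illustrative sketch only: `run` is a hypothetical helper, it treats the symbolic names f, g, h, i, j and t1 as dictionary keys rather than real registers, and it supports just `add` and `sub`.

```python
def run(program, variables):
    """Execute 'op dest,src1,src2' lines over a dict of named variables."""
    ops = {"add": lambda x, y: x + y, "sub": lambda x, y: x - y}
    for line in program:
        text = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not text:
            continue
        mnemonic, operands = text.split(None, 1)
        dest, src1, src2 = [name.strip() for name in operands.split(",")]
        variables[dest] = ops[mnemonic](variables[src1], variables[src2])
    return variables


program = [
    "add f,g,h    # f contains g+h",
    "add t1,i,j   # t1 contains i+j",
    "sub f,f,t1   # f=(g+h)-(i+j)",
]
values = run(program, {"g": 7, "h": 5, "i": 3, "j": 2})
print(values["f"])  # same value as the C expression (g + h) - (i + j)
```

Note that the temporary t1 exists only in the assembly version; it is the sixth operand counted on the slide.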
Assembly Design: Key Concepts
• Assembly language is essentially directly supported in hardware, therefore ...
Instruction Set Architecture – ISA
• Our focus in the next couple of lectures will be the Instruction Set Architecture (ISA), which is the interface between hardware and software
• It plays a vital role in understanding the computer architecture from any of the above mentioned perspectives
• The design of hardware and software can’t be initiated without defining the ISA
• It describes the instruction word format and identifies the memory addressing for data manipulation and control operations
Taxonomy of Instruction Set
• Major advances in computer architecture are typically
associated with landmark instruction set designs – stack,
accumulator, general purpose register etc.
• Design decisions must take into account:
– technology
– machine organization
– programming languages
– compiler technology
– operating systems
Taxonomy of Instruction Set ….. Cont’d
• Basic Differentiator: The type of internal storage of the operand
• Major Choices of ISA:
– Stack Architecture
– Accumulator Architecture
– General Purpose Register Architecture
   - Register – memory
   - Register – Register (load/store)
   - Memory – Memory Architecture (Obsolete)
Stack Architecture
• Both the operands are implicitly on the TOS (top of stack)
• Thus, it is also referred to as Zero-Address machine
• The operand may be either an input (orange shade) or result from the ALU (yellow shade)
• All operands are implicit (implied or inherited)
• The first operand is removed from the stack and the second operand is replaced by the result
[Figure: processor with stack (TOS) and ALU, connected to memory]
Stack Architecture
To execute: C=A+B
ADD instruction has implicit operands for the stack – operands are written in the stack using PUSH instruction
PUSH A
PUSH B
ADD
POP C
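The four-instruction sequence can be traced with a small simulation. This is a sketch under simple assumptions: memory is a Python dict keyed by variable name, the stack is a list whose last element is the TOS, and `run_stack_machine` is a hypothetical helper that understands only PUSH, ADD and POP.

```python
def run_stack_machine(program, memory):
    """Run a zero-address program; memory maps variable names to values."""
    stack = []  # the last element is the top of stack (TOS)
    for instruction in program:
        parts = instruction.split()
        op = parts[0].upper()
        if op == "PUSH":            # copy a memory operand onto the stack
            stack.append(memory[parts[1]])
        elif op == "POP":           # move the TOS back to memory
            memory[parts[1]] = stack.pop()
        elif op == "ADD":           # both operands are implicit
            right = stack.pop()     # first operand is removed
            left = stack.pop()
            stack.append(left + right)  # second operand is replaced by the result
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory


memory = run_stack_machine(["PUSH A", "PUSH B", "ADD", "POP C"], {"A": 2, "B": 3})
print(memory["C"])
```

ADD names no operand at all, which is why the stack machine is called a zero-address machine.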
Accumulator Architecture
• An accumulator is a special register within the CPU that serves both as the implicit source of one operand and as the result destination for arithmetic and logic operations.
• Thus, it accumulates or collects data and doesn’t serve as an address register at any time
• Limited number of accumulators - usually only one - are used
• The second operand is in the memory, thus accumulator based machines are also called 1-address machines
• They are useful when memory is expensive or when a limited number of addressing modes is to be used
[Figure: processor with ALU, connected to memory]
Accumulator Architecture
To execute: C=A+B
ADD instruction has implicit operand A for the accumulator, written using LOAD instruction; and the second operand B is in memory at address B
Load A
ADD B
Store C
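The same computation on a one-address machine can be sketched as follows. The single variable `acc` plays the role of the accumulator; `run_accumulator_machine` and the dict-based memory are illustrative assumptions, not part of the slides.

```python
def run_accumulator_machine(program, memory):
    """Run a one-address program; every instruction names one memory operand."""
    acc = 0  # the single accumulator register
    for instruction in program:
        op, address = instruction.split()
        op = op.upper()
        if op == "LOAD":      # acc <- mem[address]
            acc = memory[address]
        elif op == "ADD":     # acc <- acc + mem[address]
            acc = acc + memory[address]
        elif op == "STORE":   # mem[address] <- acc
            memory[address] = acc
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory


memory = run_accumulator_machine(["Load A", "ADD B", "Store C"], {"A": 2, "B": 3})
print(memory["C"])
```

Three instructions suffice here because the accumulator is both an implicit source and the implicit destination.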
General Purpose Register Architecture
• Many general purpose registers are available within CPU
• Generally, CPU registers do not have dedicated functions
and can be used for a variety of purposes – address, data
and control
• A relatively small number of bits in the instruction is
needed to identify the register
• In addition to the GPRs, there are many dedicated or
special-purpose registers as well, but many of them are not
“visible” to the programmer
• GPR architecture has explicit operands either in register or
memory thus there may exist:
- Register – memory architecture
- Register – Register (Load/Store) Architecture
- Memory – Memory Architecture
General Purpose Register Architecture
Register – Memory Architecture
• One explicit operand is in a register and one in memory and the result goes into the register
• The operand in memory is accessed directly
To execute: C=A+B
ADD instruction has explicit operand A loaded in a register and the operand B is in memory and the result is in register
[Figure: processor with registers R1, R2, R3 and ALU, connected to memory]
General Purpose Register Architecture
Register – Register (Load/store) Architecture
• The explicit operands in memory are first loaded into registers temporarily and
• Are transferred to memory by Store instruction
To execute: C=A+B
ADD instruction has explicit operands A and B loaded in registers
Load R1, A
Load R2, B
ADD R3, R1, R2
Store R3, C
• Both the explicit operands are not accessed from memory directly, i.e., Memory – Memory Architecture is obsolete
[Figure: processor with registers R1, R2, R3 and ALU, connected to memory]
Basic ISA Classes
Most real machines are hybrids of these:
Accumulator (1 register):
  1 address     add A          acc ← acc + mem[A]
  1+x address   addx A         acc ← acc + mem[A + x]
Stack:
  0 address     add            tos ← tos + next
General Purpose Register (can be memory/memory):
  2 address     add A B        EA[A] ← EA[A] + EA[B]
  3 address     add A B C      EA[A] ← EA[B] + EA[C]
Load/Store:
  3 address     add Ra Rb Rc   Ra ← Rb + Rc
                load Ra Rb     Ra ← mem[Rb]
                store Ra Rb    mem[Rb] ← Ra
Comparison:
Bytes per instruction? Number of Instructions? Cycles per instruction?
Comparing Number of Instructions
Code sequence for (C = A + B) for four classes of instruction sets:

Stack     Accumulator   Register (register-memory)   Register (load-store), e.g. MIPS
Push A    Load A        Load R1,A                    Load R1,A
Push B    Add B         Add R1,B                     Load R2,B
Add       Store C       Store C,R1                   Add R3,R1,R2
Pop C                                                Store C,R3
Another example:
Using different instruction formats, write pseudo-code to evaluate the following expression: Z = 4(A+B) – 16(C+58). Your code should not change the source operands.
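One possible answer for the load-store (register-register) format is sketched below as a small simulation. The instruction names `addi` and `sll`, the register names, and the use of left shifts in place of multiplication by 4 and 16 are all assumptions made for this sketch; other formats and other instruction choices are equally valid answers.

```python
def run_load_store(program, memory):
    """Run a load-store program; only load/store touch memory."""
    regs = {}
    for op, *args in program:
        if op == "load":       # load Rd, X : Rd <- mem[X]
            rd, address = args
            regs[rd] = memory[address]
        elif op == "store":    # store Rs, X : mem[X] <- Rs
            rs, address = args
            memory[address] = regs[rs]
        elif op == "add":      # add Rd, Ra, Rb
            rd, ra, rb = args
            regs[rd] = regs[ra] + regs[rb]
        elif op == "sub":      # sub Rd, Ra, Rb
            rd, ra, rb = args
            regs[rd] = regs[ra] - regs[rb]
        elif op == "addi":     # addi Rd, Ra, constant
            rd, ra, constant = args
            regs[rd] = regs[ra] + constant
        elif op == "sll":      # sll Rd, Ra, n : Rd <- Ra * 2**n
            rd, ra, amount = args
            regs[rd] = regs[ra] << amount
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


program = [
    ("load", "R1", "A"),
    ("load", "R2", "B"),
    ("add", "R3", "R1", "R2"),   # R3 = A + B
    ("sll", "R3", "R3", 2),      # R3 = 4(A + B)
    ("load", "R4", "C"),
    ("addi", "R4", "R4", 58),    # R4 = C + 58
    ("sll", "R4", "R4", 4),      # R4 = 16(C + 58)
    ("sub", "R5", "R3", "R4"),   # R5 = 4(A + B) - 16(C + 58)
    ("store", "R5", "Z"),
]
memory = run_load_store(program, {"A": 10, "B": 20, "C": 1})
print(memory["Z"])
```

All intermediate results live in registers, so the memory locations A, B and C are never overwritten, which satisfies the requirement that the source operands stay unchanged.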
Register-Register
Advantages
• Simple, fixed-length instruction decoding
• Simple code generation
• Similar number of clock cycles / instruction
Disadvantages
• Higher instruction count than architectures whose instructions can reference memory
• Lower instruction density leads to larger programs
Comparison of three GPR Architectures
• Register- Memory
Advantages
• Data can be accessed without separate Load first
• Instruction format is easy to encode
Disadvantages
• Operands are not equivalent since a source
operand (in a register) is destroyed in operation
• Encoding a register number and memory address
in each instruction may restrict the number of
registers
• CPI varies by operand location
Comparison of three GPR Architectures
• Memory- Memory
Advantages
• Most compact
• Doesn’t waste registers for temporary storages
Disadvantages
• Large variation in instruction size
• Large variation in work per instruction
• Memory bottleneck by memory access
Evolution of Instruction Sets
Single Accumulator (EDSAC 1950)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953)
Data Transfer
- Load, store and move instructions with memory addressing
Control
- Branch, Jump, procedure call and return
Categories of Instruction Set Operations … Cont’d
The following instructions may be supported to different degrees, depending on the computer
System
- Operating System call, Virtual Memory Management
Floating point
- Add, multiply, divide and compare
Decimal
- BCD add, multiply and Decimal to Character Conversion
String
- String move, compare and search
Graphics
- Pixel and vertex operations, compression / de-compression operations
Operand Addressing Modes
• An “effective address” is the binary bit pattern
issued by the CPU to specify the location of
operands in CPU (register) or the memory
Commonly used addressing modes … cont’d
Types of Indirect addressing modes:
Register Indirect
Register Indirect Indexed
- Effective memory address is calculated by adding another register
(index register) to the value in a CPU register (usually referred to as
the base register)
Useful for accessing 2-D arrays
Register Indirect plus displacement
- Similarly, “based” refers to the situation when the constant refers to
the offset (displacement) of an array element with respect to the first
element. The address of the first element is stored in a register
Memory Indirect
Commonly used addressing modes … cont’d
Meanings of Indirect Addressing Modes
- Register Indirect
ADD R4, (R1)     Reg[R4] ← Reg[R4] + Mem[Reg[R1]]
- Memory Indirect
ADD R4, @(R1)    Reg[R4] ← Reg[R4] + Mem[Mem[Reg[R1]]]
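The difference between the two modes is one extra memory access. The sketch below models registers and memory as Python dicts; the function names, the addresses 100 and 200, and the stored values are made up for illustration.

```python
def add_register_indirect(regs, mem, dest, pointer):
    """ADD dest, (pointer): the register holds the operand's address."""
    regs[dest] = regs[dest] + mem[regs[pointer]]


def add_memory_indirect(regs, mem, dest, pointer):
    """ADD dest, @(pointer): the register holds the address of the operand's address."""
    regs[dest] = regs[dest] + mem[mem[regs[pointer]]]


mem = {100: 200, 200: 7}          # mem[100] holds a pointer to address 200
regs = {"R1": 100, "R4": 1}

add_register_indirect(regs, mem, "R4", "R1")
after_register_indirect = regs["R4"]   # 1 + mem[100]

regs["R4"] = 1
add_memory_indirect(regs, mem, "R4", "R1")
after_memory_indirect = regs["R4"]     # 1 + mem[mem[100]]

print(after_register_indirect, after_memory_indirect)
```

With the same R1, register indirect adds the pointer stored at address 100, while memory indirect follows that pointer and adds the value stored at address 200.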
Special Addressing Modes
Used for stepping through arrays within loops; R2 points to the start of the array; each reference increments / decrements R2 by ‘d’, the size of the elements in the array
Addressing Modes of Control Flow Instructions
- Branch (conditional)
a sort of displacement, in number of instructions, relative to PC
- Jump (Unconditional)
jump to an absolute address, independent of the position of PC
- Procedure call/return
control transfer with some state and return address saving, sometimes in a special link register or in some GPRs (General Purpose Registers)
Assembly Variables: Registers (1/4)
Assembly Variables: Registers (2/4)
• 32 registers in MIPS
– Why 32? Smaller is faster
Assembly Variables: Registers (3/4)
• Number references:
$0, $1, $2, … $30, $31
Assembly Variables: Registers (4/4)
• By convention, each register also has a name to make it easier to code
• For now:
$16 - $23 → $s0 - $s7 (correspond to C variables)
$8 - $15 → $t0 - $t7 (correspond to temporary variables)
MIPS Addressing Formats (Summary)
• How memory can be addressed in MIPS
1. Immediate addressing: op rs rt Immediate
2. Register addressing: op rs rt rd ... funct (operand is in a register)
3. Base addressing: op rs rt Address (register + Address selects a location in memory)
4. PC-relative addressing: op rs rt Address (PC + Address selects a word in memory)
5. Pseudodirect addressing: op Address (PC combined with Address selects a word in memory)
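The address arithmetic behind modes 3 to 5 can be sketched in a few lines. The details used here (the 16-bit field is sign-extended, branch offsets count words relative to PC + 4, and the 26-bit jump field is shifted left by 2 and combined with the upper 4 bits of PC + 4) follow the standard MIPS32 definition and are assumptions beyond what the slide states; the function names and example addresses are illustrative.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def base_address(register_value, offset16):
    """Base addressing: register + sign-extended 16-bit offset (in bytes)."""
    return (register_value + sign_extend_16(offset16)) & 0xFFFFFFFF


def pc_relative_target(pc, offset16):
    """PC-relative addressing: (PC + 4) + word offset * 4."""
    return (pc + 4 + (sign_extend_16(offset16) << 2)) & 0xFFFFFFFF


def pseudodirect_target(pc, target26):
    """Pseudodirect addressing: upper 4 bits of PC + 4, then the 26-bit field * 4."""
    return ((pc + 4) & 0xF0000000) | ((target26 & 0x3FFFFFF) << 2)


print(hex(base_address(0x10010000, 12)))
print(hex(pc_relative_target(0x00400000, 3)))
print(hex(pseudodirect_target(0x00400000, 0x0100000)))
```

Counting branch and jump fields in words rather than bytes extends their reach by a factor of four, which works because every instruction is exactly one word wide.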
Data Transfer: Memory to Reg (1/4)
Data Transfer: Memory to Reg (2/4)
• To specify a memory address to copy from, specify two things:
– A register which contains a pointer to memory
– A numerical offset (in bytes)
• Example: 8($t0)
– specifies the memory address pointed to by the value in $t0, plus
8 bytes
Data Transfer: Memory to Reg (3/4)
• Load Instruction Syntax:
1 2, 3(4)
– where
1) operation (instruction) name
2) register that will receive value
3) numerical offset in bytes
4) register containing pointer to memory
• Instruction Name:
– lw (meaning Load Word, so 32 bits or one word are loaded at a
time)
Data Transfer: Memory to Reg (4/4)
• Example: lw $t0, 12($s0)
This instruction will take the pointer in $s0, add 12 bytes to it, and
then load the value from the memory pointed to by this calculated
sum into register $t0
• Notes:
– $s0 is called the base register
– 12 is called the offset
– offset is generally used in accessing elements of array or structure:
base register points to beginning of array or structure
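As a rough illustration of base plus offset, the sketch below models `lw $t0, 12($s0)` on an array of 4-byte words. Memory is a dict keyed by byte address, and the base address 0x10010000, the array contents, and the helper name `lw` are hypothetical values chosen for the example; the alignment check reflects the MIPS rule that word loads use addresses that are multiples of 4.

```python
def lw(regs, memory, dest, offset, base):
    """Load word: dest <- memory[regs[base] + offset], with offset in bytes."""
    address = regs[base] + offset
    if address % 4 != 0:
        raise ValueError("lw requires a word-aligned address")
    regs[dest] = memory[address]


array_base = 0x10010000  # hypothetical start address of an int array
# Four 32-bit elements stored at consecutive word addresses (4 bytes apart).
memory = {array_base + 4 * index: value for index, value in enumerate([10, 20, 30, 40])}

regs = {"$s0": array_base, "$t0": 0}
lw(regs, memory, "$t0", 12, "$s0")  # offset 12 bytes = element index 3
print(regs["$t0"])
```

Since each word occupies 4 bytes, element i of a word array sits at offset 4 * i from the base register.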
Data Transfer: Reg to Memory
• Also want to store value from a register into memory
• Instruction Name:
Addressing: Byte vs. word
• Every word in memory has an address, similar to an index in an array
Role of Registers vs. Memory
• What if more variables than registers?
– Compiler tries to keep the most frequently used variables in registers
– Less frequently used variables are written to memory
Register Conventions (1/6)
Register Conventions (2/6)
Register Conventions (3/6)
• $s0-$s7: No Change. Very important, that’s why they’re called saved
registers. If the callee changes these in any way, it must restore the
original values before returning.
• $sp: No Change. The stack pointer must point to the same place before
and after the jal call, or else the caller won’t be able to restore values
from the stack.
• $ra: Change. The jal call itself will change this register.
jal label
– Copies the address of the next instruction into the register $ra (register 31)
and then jumps to the address label.
Register Conventions (4/6)
• What do these conventions mean?
– If function A calls function B, then function A must
save any temporary registers that it may be using onto
the stack before making a jal call.
– Function B must save any S (saved) registers it
intends to use before garbling up their values
– Remember: Caller/callee need to save only temporary
/ saved registers they are using, not all registers.
Register Conventions (5/6)
• Note that, if the callee is going to use some s registers, it must:
– save those s registers on the stack
– use the registers
– restore s registers from the stack
– jr $ra
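The save, use, restore, return pattern can be mimicked in a small simulation. This is only a sketch of the convention, not MIPS code: registers are a dict, the stack is a Python list, and `$a0` and `$v0` are the usual MIPS argument and return-value registers, which are assumed here rather than taken from the slides.

```python
def callee(regs, stack):
    """Use $s0 as scratch space while honouring the saved-register convention."""
    stack.append(regs["$s0"])          # save $s0 on the stack
    regs["$s0"] = regs["$a0"] * 2      # use the register
    regs["$v0"] = regs["$s0"] + 1      # leave the result for the caller
    regs["$s0"] = stack.pop()          # restore $s0 from the stack
    # returning from this function stands in for jr $ra


regs = {"$s0": 42, "$a0": 5, "$v0": 0}
stack = []
callee(regs, stack)
print(regs["$v0"], regs["$s0"], len(stack))
```

After the call, `$s0` holds its original value and the stack is back to its original depth, which is exactly what the caller is entitled to assume.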
Other Registers (6/6)
• $at: may be used by the assembler at any time; always unsafe to use