04 x86-64 Programming Part One

This lecture covers the fundamentals of x86-64 programming, focusing on the architecture, instruction set, and assembly language. Key concepts include the distinction between architecture and microarchitecture, the types of instruction sets (CISC vs. RISC), and the importance of understanding assembly for performance tuning and systems software. The lecture also discusses data types, registers, memory operations, and basic instruction formats in x86-64 assembly.

Lecture 04

x86-64 Programming – part 1

Euhyun Moon, Ph.D.


Machine Learning Systems (MLSys) Lab
Computer Science and Engineering
Sogang University

Slides adapted from Randy Bryant and Dave O'Hallaron: Introduction to Computer Systems, CMU

Sogang University CSE3030: Introduction to Computer Systems
Architecture Sits at the Hardware Interface
• Source code: different applications or algorithms (C language: Program A, Program B, your program)
• Compiler: performs optimizations, generates instructions (GCC, Clang)
• Architecture: the instruction set (x86-64; ARMv8 (AArch64/A64))
• Hardware: different implementations
  • x86-64: Intel Pentium 4, Intel Core 2, Intel Core i7, AMD Opteron, AMD Athlon
  • ARMv8: ARM Cortex-A53, Apple A7
Definitions
• Architecture (ISA): The parts of a processor design that
one needs to understand to write assembly code
• “What is directly visible to software”
• The “contract” or “blueprint” between hardware and software

• Microarchitecture: Implementation of the architecture



Instruction Set Architectures
• The ISA defines:
• The system’s state (e.g., registers, memory, program counter)
• The instructions the CPU can execute
• The effect that each of these instructions will have on the system state

[Figure: a CPU containing the PC and registers, connected to Memory]


Instruction Set Philosophies
• Complex Instruction Set Computing (CISC): add more and more elaborate and specialized instructions as needed
  • Lots of tools for programmers to use, but hardware must be able to handle all instructions
  • x86-64 is CISC, but only a small subset of its instructions is encountered in Linux programs
• Reduced Instruction Set Computing (RISC): keep the instruction set small and regular
  • Easier to build fast hardware
  • Let software do the complicated operations by composing simpler ones


General ISA Design Decisions
• Instructions
• What instructions are available? What do they do?
• How are they encoded?

• Registers
• How many registers are there?
• How wide are they?

• Memory
• How do you specify a memory location?



Mainstream ISAs

• x86-64 Instruction Set: MacBooks & PCs (Core i3, i5, i7, M)
• ARM Instruction Set: smartphone-like devices (iPhone, iPad, Raspberry Pi)
• RISC-V Instruction Set: mostly research (some traction in embedded)


Writing Assembly Code? In 2025?
• Chances are, you’ll never write a program in assembly, but understanding assembly is the key to the machine-level execution model:
  • Behavior of programs in the presence of bugs
    • When the high-level language model breaks down
  • Tuning program performance
    • Understanding optimizations done/not done by the compiler
    • Understanding sources of program inefficiency
  • Implementing systems software
    • What are the “states” of processes that the OS must manage?
    • Using special units (timers, I/O co-processors, etc.) inside the processor
  • Fighting malicious software
    • Distributed software is in binary form


Assembly Programmer’s View
[Figure: the CPU (registers, PC, condition codes) sends addresses to Memory (code, data, stack); data and instructions flow between them]

• Programmer-visible state
  • PC: the Program Counter (%rip in x86-64)
    • Address of next instruction
  • Named registers
    • Together in “register file”
    • Heavily used program data
  • Condition codes
    • Store status information about most recent arithmetic operation
    • Used for conditional branching
• Memory
  • Byte-addressable array
  • Code and user data
  • Includes the Stack (for supporting procedures)


x86-64 Assembly “Data Types”
• Integral data of 1, 2, 4, or 8 bytes
  • Data values
  • Addresses
• Floating point data of 4, 8, 10 or 2x8 or 4x4 or 8x2 bytes (not covered in CSE3030)
  • Different registers for those (e.g., %xmm1, %ymm2)
  • Come from extensions to x86 (SSE, AVX, …)
• No aggregate types such as arrays or structures
  • Just contiguously allocated bytes in memory
• Two common syntaxes
  • “AT&T”: used by our course, slides, textbook, GNU tools, …
  • “Intel”: used by Intel documentation, Intel tools, …
  • Must know which you’re reading


What is a Register?
• A location in the CPU that stores a small amount of data,
which can be accessed very quickly (once every clock cycle)

• Registers have names, not addresses


• In assembly, they start with % (e.g., %rsi)

• Registers are at the heart of assembly programming


• They are a precious commodity in all architectures, but especially
x86



x86-64 Integer Registers – 64 bits wide

64-bit   Low 32 bits     64-bit   Low 32 bits
%rax     %eax            %r8      %r8d
%rbx     %ebx            %r9      %r9d
%rcx     %ecx            %r10     %r10d
%rdx     %edx            %r11     %r11d
%rsi     %esi            %r12     %r12d
%rdi     %edi            %r13     %r13d
%rsp     %esp            %r14     %r14d
%rbp     %ebp            %r15     %r15d

• Can reference low-order 4 bytes (also low-order 2 & 1 bytes)
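
The relationship between a 64-bit register and its narrower names can be modelled in C with masks. This is only a sketch: the helper names `low32`, `low16`, and `low8` are invented for illustration, and the code models the register as a plain `uint64_t` value rather than touching any real register.

```c
#include <stdint.h>

/* Hypothetical helpers (not part of any library): each returns the part of a
   64-bit register value that is visible through a narrower register name. */
uint32_t low32(uint64_t r) { return (uint32_t)(r & 0xFFFFFFFFu); } /* e.g. %eax for %rax */
uint16_t low16(uint64_t r) { return (uint16_t)(r & 0xFFFFu); }     /* e.g. %ax  for %rax */
uint8_t  low8(uint64_t r)  { return (uint8_t)(r & 0xFFu); }        /* e.g. %al  for %rax */
```

For a register holding 0x1122334455667788, the three helpers pick out the low 4, 2, and 1 bytes of that value respectively.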



Some History: IA32 Registers – 32 bits wide

32-bit   16-bit   8-bit high   8-bit low   Name origin
%eax     %ax      %ah          %al         accumulate
%ecx     %cx      %ch          %cl         counter
%edx     %dx      %dh          %dl         data
%ebx     %bx      %bh          %bl         base
%esi     %si                               source index
%edi     %di                               destination index
%esp     %sp                               stack pointer
%ebp     %bp                               base pointer

• General purpose registers
• The 16-bit names are virtual registers (backwards compatibility)
• The name origins are mostly obsolete
Memory vs. Register
Property              Memory                                      Register
Addresses vs. Names   0x7FFFD024C3DC                              %rdi
Big vs. Small         ~8 GiB                                      (16 x 8 B) = 128 B
Slow vs. Fast         ~50-100 ns                                  sub-nanosecond timescale
Dynamic vs. Static    can “grow” as needed while program runs     fixed number in hardware


Three Basic Kinds of Instructions
1) Transfer data between memory and register
  • Load data from memory into register
    • %reg = Mem[address]
  • Store register data into memory
    • Mem[address] = %reg
  • Remember: memory is indexed just like an array of bytes!
2) Perform arithmetic operation on register or memory data
  • c = a + b; z = x << y; i = h & g;
3) Control flow: what instruction to execute next
  • Unconditional jumps to/from procedures
  • Conditional branches


Instruction Sizes
• Size specifiers
  • b = 1-byte “byte”
  • w = 2-byte “word”
  • l = 4-byte “long word”
  • q = 8-byte “quad word”
• Note that due to backwards-compatible support for 8086 programs (16-bit machines!), “word” means 16 bits = 2 bytes in x86 instruction names


Operand Types
• Immediate: Constant integer data ($)
  • Examples: $0x400, $-533
  • Like C literal, but prefixed with ‘$’
  • Encoded with 1, 2, 4, or 8 bytes depending on the instruction
• Register: 1 of 16 integer registers (%)
  • Examples: %rax, %r13
  • But %rsp is reserved for special use
  • Others have special uses for particular instructions
• Memory: Consecutive bytes of memory at a computed address, written with parentheses ( )
  • Simplest example: (%rax)
  • Various other “address modes”

[Figure: the integer registers %rax, %rcx, %rdx, %rbx, %rsi, %rdi, %rsp, %rbp, %rN]


x86-64 Introduction
• Data transfer instruction (mov)
• Arithmetic operations
• Memory addressing modes
• swap example



Moving Data
• General form: mov_ source, destination
• Missing letter (_) specifies size of operands
• Note that due to backwards-compatible support for 8086 programs
(16-bit machines!), “word” means 16 bits = 2 bytes in x86 instruction
names
• Lots of these in typical code

• movb src, dst: move 1-byte “byte”
• movw src, dst: move 2-byte “word”
• movl src, dst: move 4-byte “long word”
• movq src, dst: move 8-byte “quad word”


Operand Combinations

Source   Dest   Src, Dest             C Analog
Imm      Reg    movq $0x4, %rax       var_a = 0x4;
Imm      Mem    movq $-147, (%rax)    *p_a = -147;
Reg      Reg    movq %rax, %rdx       var_d = var_a;
Reg      Mem    movq %rax, (%rdx)     *p_d = var_a;
Mem      Reg    movq (%rax), %rdx     var_d = *p_a;

• Cannot do memory-memory transfer with a single instruction
  • How would you do it?
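
One way to answer the memory-to-memory question is to go through a register: load from the source address, then store to the destination address. The C sketch below mirrors that two-step pattern. The function name `copy_quad` is made up, and the registers named in the comments are an assumption about how a compiler might assign them, not a guaranteed output.

```c
/* Copy one 8-byte value from *src to *dst in two steps, because a single
   mov cannot take two memory operands. */
void copy_quad(const long *src, long *dst)
{
    long tmp = *src;   /* load:  e.g. movq (%rdi), %rax  (Mem -> Reg) */
    *dst = tmp;        /* store: e.g. movq %rax, (%rsi)  (Reg -> Mem) */
}
```

The temporary `tmp` plays the role of the register that sits between the two memory locations.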


Some Arithmetic Operations
• Binary (two-operand) Instructions:

Format            Computation                      Notes
addq src, dst     dst = dst + src (dst += src)
subq src, dst     dst = dst - src
imulq src, dst    dst = dst * src                  signed mult
sarq src, dst     dst = dst >> src                 arithmetic
shrq src, dst     dst = dst >> src                 logical
shlq src, dst     dst = dst << src                 (same as salq)
xorq src, dst     dst = dst ^ src
andq src, dst     dst = dst & src
orq src, dst      dst = dst | src

• The trailing q is the operand size specifier
• Maximum of one memory operand
• Beware argument order!
• No distinction between signed and unsigned
  • Only arithmetic vs. logical shifts
• How do you implement “rcx = rax + rbx”?
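
The two-operand form and the two kinds of right shift can be sketched in C. All function names here are illustrative. The code assumes shift counts below 64, and it assumes the usual two's-complement behaviour when converting an out-of-range `uint64_t` back to `int64_t`, which is implementation-defined in older C standards.

```c
#include <stdint.h>

/* rcx = rax + rbx using only "dst = dst + src" steps. */
uint64_t add_into_third(uint64_t rax, uint64_t rbx)
{
    uint64_t rcx = rax;   /* movq %rax, %rcx */
    rcx += rbx;           /* addq %rbx, %rcx */
    return rcx;
}

/* shrq: logical right shift, vacated high bits are filled with zeros. */
uint64_t shr64(uint64_t x, unsigned n)
{
    return x >> n;        /* assumes n < 64 */
}

/* sarq: arithmetic right shift, vacated high bits copy the sign bit. */
int64_t sar64(int64_t x, unsigned n)
{
    uint64_t shifted = (uint64_t)x >> n;   /* assumes n < 64 */
    if (x < 0)
        shifted |= ~(UINT64_MAX >> n);     /* set the top n bits */
    return (int64_t)shifted;
}
```

With the same bit pattern as input, `sar64(-16, 2)` yields -4 while `shr64` of that pattern yields a large positive number, which is the only place the signed/unsigned distinction shows up among these instructions.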


Some Arithmetic Operations
• Unary (one-operand) Instructions:

Format      Computation       Notes
incq dst    dst = dst + 1     increment
decq dst    dst = dst - 1     decrement
negq dst    dst = -dst        negate
notq dst    dst = ~dst        bitwise complement

• See CS:APP3e textbook Section 3.5.5 for more instructions: mulq, cqto, idivq, divq


Arithmetic Example

Register   Use(s)
%rdi       1st argument (x)
%rsi       2nd argument (y)
%rax       return value

long simple_arith(long x, long y)
{
    long t1 = x + y;
    long t2 = t1 * 3;
    return t2;
}

Equivalent C that follows the assembly step by step:

    y += x;
    y *= 3;
    long r = y;
    return r;

simple_arith:
    addq %rdi, %rsi
    imulq $3, %rsi
    movq %rsi, %rax
    ret


Example of Basic Addressing Modes

void swap(long *xp, long *yp)
{
    long t0 = *xp;
    long t1 = *yp;
    *xp = t1;
    *yp = t0;
}

Compiler Explorer: https://godbolt.org/z/zc4Pcq

swap:
    movq (%rdi), %rax
    movq (%rsi), %rdx
    movq %rdx, (%rdi)
    movq %rax, (%rsi)
    ret


Understanding swap()
void swap(long *xp, long *yp)
{
    long t0 = *xp;
    long t1 = *yp;
    *xp = t1;
    *yp = t0;
}

swap:
    movq (%rdi), %rax   # t0 = *xp
    movq (%rsi), %rdx   # t1 = *yp
    movq %rdx, (%rdi)   # *xp = t1
    movq %rax, (%rsi)   # *yp = t0
    ret

Register   Variable
%rdi   ⇔   xp
%rsi   ⇔   yp
%rax   ⇔   t0
%rdx   ⇔   t1

Step-by-step state (the memory words at 0x118, 0x110, and 0x108 are not involved):

After instruction       %rdi    %rsi    %rax   %rdx   Mem[0x120]   Mem[0x100]
(initial state)         0x120   0x100                 123          456
movq (%rdi), %rax       0x120   0x100   123           123          456
movq (%rsi), %rdx       0x120   0x100   123    456    123          456
movq %rdx, (%rdi)       0x120   0x100   123    456    456          456
movq %rax, (%rsi)       0x120   0x100   123    456    456          123


Memory Addressing Modes: Basic
• Indirect: (R) Mem[Reg[R]]
• Data in register R specifies the memory address
• Like pointer dereference in C
• Example: movq (%rcx), %rax

• Displacement: D(R) Mem[Reg[R]+D]


• Data in register R specifies the start of some memory region
• Constant displacement D specifies the offset from that address
• Example: movq 8(%rbp), %rdx



Complete Memory Addressing Modes
• General:
• D(Rb,Ri,S) Mem[Reg[Rb]+Reg[Ri]*S+D]
• Rb: Base register (any register)
• Ri: Index register (any register except %rsp)
• S: Scale factor (1, 2, 4, 8) – why these numbers?
• D: Constant displacement value (a.k.a. immediate)

• Special cases (see CS:APP3e Figure 3.3 on p.181)


• D(Rb,Ri) Mem[Reg[Rb]+Reg[Ri]+D] (S=1)
• (Rb,Ri,S) Mem[Reg[Rb]+Reg[Ri]*S] (D=0)
• (Rb,Ri) Mem[Reg[Rb]+Reg[Ri]] (S=1,D=0)
• (,Ri,S) Mem[Reg[Ri]*S] (Rb=0,D=0)
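
These forms line up with ordinary C pointer arithmetic, which also suggests why the scale factors are 1, 2, 4, and 8: they match the sizes of the primitive data types. The sketch below shows two common cases as plain address arithmetic. `struct point` and both function names are hypothetical, introduced only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

struct point { long x; long y; };   /* hypothetical example type */

/* (Rb,Ri,8): address of element i in an array of 8-byte values. */
uintptr_t elem_address(const int64_t *base, size_t i)
{
    return (uintptr_t)base + i * sizeof(int64_t);
}

/* D(Rb): address of a field at constant byte offset D from a struct's start. */
uintptr_t field_y_address(const struct point *p)
{
    return (uintptr_t)p + offsetof(struct point, y);
}
```

Both functions return the same addresses that `&base[i]` and `&p->y` would give, which is the sense in which these addressing modes map onto array indexing and field access.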



Address Computation Examples

%rdx   0xf000
%rcx   0x0100

D(Rb,Ri,S) → Mem[Reg[Rb]+Reg[Ri]*S+D]

Expression       Address Computation   Address
0x8(%rdx)
(%rdx,%rcx)
(%rdx,%rcx,4)
0x80(,%rdx,2)
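
The four expressions can be checked by writing the general formula as a small C function. This is a sketch: `effective_address` is an illustrative name, its parameter order simply mirrors D(Rb,Ri,S), and a missing base or index is passed as 0 (with S = 1 when no scale is written).

```c
#include <stdint.h>

/* Reg[Rb] + Reg[Ri]*S + D, computed as plain 64-bit integer arithmetic. */
uint64_t effective_address(uint64_t d, uint64_t rb, uint64_t ri, uint64_t s)
{
    return rb + ri * s + d;
}

/* With %rdx = 0xf000 and %rcx = 0x0100:
     0x8(%rdx)       -> effective_address(0x8,  0xf000, 0,      1) = 0xf008
     (%rdx,%rcx)     -> effective_address(0,    0xf000, 0x0100, 1) = 0xf100
     (%rdx,%rcx,4)   -> effective_address(0,    0xf000, 0x0100, 4) = 0xf400
     0x80(,%rdx,2)   -> effective_address(0x80, 0,      0xf000, 2) = 0x1e080 */
```

Note that the last expression has no base register, so %rdx is used as the index and gets multiplied by the scale factor.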


Address Computation Instruction
• leaq src, dst
• "lea" stands for load effective address
• src is address expression (any of the formats we’ve seen)
• dst is a register
• Sets dst to the address computed by the src expression
(does not go to memory! – it just does math)
• Example: leaq (%rdx,%rcx,4), %rax
• Uses:
• Computing addresses without a memory reference
• e.g., translation of p = &x[i];
• Computing arithmetic expressions of the form x+k*i+d
• Though k can only be 1, 2, 4, or 8
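
Both uses can be written as C that never reads memory. The sketch below is an illustration under assumptions: the function names are invented, and the assembly in the comments shows one plausible translation given x in %rdi and i in %rsi, not necessarily what a particular compiler emits.

```c
#include <stdint.h>

/* p = &x[i]: only an address is computed, x[i] itself is never read.
   Plausible translation: leaq (%rdi,%rsi,8), %rax */
const int64_t *addr_of_elem(const int64_t *x, int64_t i)
{
    return &x[i];
}

/* An expression of the form x + k*i + d with k = 4 and d = 7.
   Plausible translation: leaq 7(%rdi,%rsi,4), %rax */
uint64_t lea_arith(uint64_t x, uint64_t i)
{
    return x + 4 * i + 7;
}
```

In the second function nothing is an address at all, which is why lea is often described as doing arithmetic in address-expression clothing.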



Example: lea vs. mov
Registers
%rax   123
%rbx
%rcx   0x4
%rdx   0x100
%rdi
%rsi

Memory
Word Address   Value
0x120          0x400
0x118          0xF
0x110          0x8
0x108          0x10
0x100          0x1

leaq (%rdx,%rcx,4), %rax
movq (%rdx,%rcx,4), %rbx
leaq (%rdx), %rdi
movq (%rdx), %rsi
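
The difference between the two instructions can be traced with a small C model of this example. It is a sketch only: memory is an array holding the five words listed above, `struct regs`, `load_quad`, and `run_example` are made-up names, and registers that are overwritten before being read simply start at 0 here.

```c
#include <stdint.h>

/* The five memory words from the example, indexed by (address - 0x100) / 8. */
static const uint64_t mem[5] = { 0x1, 0x10, 0x8, 0xF, 0x400 };

/* Read the 8-byte word at addr (assumes an aligned address in 0x100..0x120). */
uint64_t load_quad(uint64_t addr)
{
    return mem[(addr - 0x100) / 8];
}

struct regs { uint64_t rax, rbx, rcx, rdx, rdi, rsi; };

struct regs run_example(void)
{
    struct regs r = { 0, 0, 0x4, 0x100, 0, 0 };
    r.rax = r.rdx + r.rcx * 4;              /* leaq (%rdx,%rcx,4), %rax : address only  */
    r.rbx = load_quad(r.rdx + r.rcx * 4);   /* movq (%rdx,%rcx,4), %rbx : memory access */
    r.rdi = r.rdx;                          /* leaq (%rdx), %rdi        : address only  */
    r.rsi = load_quad(r.rdx);               /* movq (%rdx), %rsi        : memory access */
    return r;
}
```

In this model the lea lines leave the computed addresses 0x110 and 0x100 in %rax and %rdi, while the mov lines leave the contents of those addresses, 0x8 and 0x1, in %rbx and %rsi.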


lea – “It just does math”


Arithmetic Example

Register   Use(s)
%rdi       1st argument (x)
%rsi       2nd argument (y)
%rdx       3rd argument (z)

long arith(long x, long y, long z)
{
    long t1 = x + y;
    long t2 = z + t1;
    long t3 = x + 4;
    long t4 = y * 48;
    long t5 = t3 + t4;
    long rval = t2 * t5;
    return rval;
}

arith:
    leaq (%rdi,%rsi), %rax
    addq %rdx, %rax
    leaq (%rsi,%rsi,2), %rdx
    salq $4, %rdx
    leaq 4(%rdi,%rdx), %rcx
    imulq %rcx, %rax
    ret

• Interesting Instructions
  • leaq: “address” computation
  • salq: shift
  • imulq: multiplication
    • Only used once!
Arithmetic Example (annotated)

Register   Use(s)
%rdi       x
%rsi       y
%rdx       z, t4
%rax       t1, t2, rval
%rcx       t5

arith:
    leaq (%rdi,%rsi), %rax      # rax/t1 = x + y
    addq %rdx, %rax             # rax/t2 = t1 + z
    leaq (%rsi,%rsi,2), %rdx    # rdx = 3 * y
    salq $4, %rdx               # rdx/t4 = (3*y) * 16
    leaq 4(%rdi,%rdx), %rcx     # rcx/t5 = x + t4 + 4
    imulq %rcx, %rax            # rax/rval = t5 * t2
    ret
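
To see that the instruction sequence computes the same value as the source, the assembly can be transcribed back into C one instruction per statement. In this sketch `arith` is the function from the slide, while `arith_steps` and its register-named locals are illustrative. The shift by 4 is written as a multiplication by 16 so that the C stays well defined for negative values.

```c
/* The original C function from the slide. */
long arith(long x, long y, long z)
{
    long t1 = x + y;
    long t2 = z + t1;
    long t3 = x + 4;
    long t4 = y * 48;
    long t5 = t3 + t4;
    long rval = t2 * t5;
    return rval;
}

/* The same computation, one statement per assembly instruction. */
long arith_steps(long x, long y, long z)
{
    long rax = x + y;         /* leaq (%rdi,%rsi), %rax   : t1 = x + y        */
    rax += z;                 /* addq %rdx, %rax          : t2 = t1 + z       */
    long rdx = y + y * 2;     /* leaq (%rsi,%rsi,2), %rdx : 3 * y             */
    rdx *= 16;                /* salq $4, %rdx            : t4 = 3*y*16 = 48y */
    long rcx = 4 + x + rdx;   /* leaq 4(%rdi,%rdx), %rcx  : t5 = x + t4 + 4   */
    rax *= rcx;               /* imulq %rcx, %rax         : rval = t2 * t5    */
    return rax;
}
```

The multiplication by 48 never appears as such: it is split into a multiplication by 3 (done by leaq) and a multiplication by 16 (done by the shift).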
Control Flow

Register   Use(s)
%rdi       1st argument (x)
%rsi       2nd argument (y)
%rax       return value

long max(long x, long y)
{
    long max;
    if (x > y) {
        max = x;
    } else {
        max = y;
    }
    return max;
}

max:
    ???
    movq %rdi, %rax
    ???
    ???
    movq %rsi, %rax
    ???
    ret


Control Flow (filled in)

max:
    Conditional jump: if x <= y then jump to else
    movq %rdi, %rax
    Unconditional jump: jump to done
else:
    movq %rsi, %rax
done:
    ret
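
The jump structure of max can be mimicked in C with goto, which makes the two kinds of jump explicit. This is a sketch: `max_goto` and the label `else_branch` are illustrative names (`else` itself is a C keyword and cannot be used as a label).

```c
long max_goto(long x, long y)
{
    long result;
    if (x <= y) goto else_branch;   /* conditional jump: taken when x <= y */
    result = x;                     /* movq %rdi, %rax */
    goto done;                      /* unconditional jump over the else part */
else_branch:
    result = y;                     /* movq %rsi, %rax */
done:
    return result;                  /* ret */
}
```

The condition tested is the negation of `x > y`, because the jump skips the "then" part rather than entering it.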


Conditionals and Control Flow
• Conditional branch/jump
• Jump to somewhere else if some condition is true,
otherwise execute next instruction
• Unconditional branch/jump
• Always jump when you get to this instruction

• Together, they can implement most control flow constructs in high-level languages:
  • if (condition) then {…} else {…}
  • while (condition) {…}
  • do {…} while (condition)
  • for (initialization; condition; iterative) {…}
  • switch {…}
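
As an illustration of how a loop reduces to these two kinds of jump, the sketch below writes the same summation twice: once with while, and once using only a conditional and an unconditional goto. Both function names are made up for this example.

```c
/* Sum of 0 .. n-1 written with an ordinary while loop. */
long sum_while(long n)
{
    long i = 0, s = 0;
    while (i < n) {
        s += i;
        i++;
    }
    return s;
}

/* The same loop expressed with one conditional and one unconditional jump. */
long sum_goto(long n)
{
    long i = 0, s = 0;
loop:
    if (i >= n) goto done;   /* conditional jump: leave the loop */
    s += i;
    i++;
    goto loop;               /* unconditional jump: back to the test */
done:
    return s;
}
```

The other constructs in the list follow the same idea with the test and the jumps arranged differently.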


Summary
• x86-64 is a complex instruction set computing (CISC)
architecture
• There are 3 types of operands in x86-64
• Immediate, Register, Memory
• There are 3 types of instructions in x86-64
• Data transfer, Arithmetic, Control Flow

• Memory Addressing Modes: The addresses used for accessing memory in mov (and other) instructions can be computed in several different ways
  • Base register, index register, scale factor, and displacement map well to pointer arithmetic operations
• Control flow in x86 is determined by Condition Codes