5 - Tools & x86-64 Assembly - Study Guide
For COMP201: Computer Systems & Programming at Koç University
GCC Compilation Pipeline
Let me walk you through what happens when you type gcc hello.c -o hello. Your code goes through a 4-stage pipeline: preprocessing, compilation, assembly, and linking. Let's look at each one.
Stage 1: Preprocessing
What it does: Expands #include directives, processes #define macros, removes comments.
Flag: gcc -E
Output: .i file (preprocessed C code, still human-readable but expanded)
gcc -E hello.c -o hello.i
If you open hello.i, you'll see your original source code, but every #include <stdio.h> has been replaced with the entire contents of that header file, and every macro has been expanded.
Key insight: The preprocessor is very dumb - it's just text manipulation. It doesn't understand C at all.
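Because the preprocessor only substitutes text, a macro argument is pasted in with no regard for operator precedence. The sketch below shows the effect with two hypothetical macros, SQUARE_NAIVE and SQUARE_SAFE (the names are illustrative and not part of the course material).

```c
/* Hypothetical macros, for illustration only. */
#define SQUARE_NAIVE(x) x * x          /* argument is pasted as-is */
#define SQUARE_SAFE(x)  ((x) * (x))    /* parentheses keep the argument together */

/* After preprocessing, the body reads: return 1 + 2 * 1 + 2;  which is 5 */
int naive_result(void) { return SQUARE_NAIVE(1 + 2); }

/* After preprocessing, the body reads: return ((1 + 2) * (1 + 2));  which is 9 */
int safe_result(void) { return SQUARE_SAFE(1 + 2); }
```

Running gcc -E on a file containing these lines shows the expanded expressions directly in the .i output.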
Stage 2: Compilation
What it does: Converts C code into assembly language (human-readable machine instructions).
Flag: gcc -S
Output: .s file (assembly code)
gcc -S hello.c -o hello.s
Now you have assembly code! This is where the compiler (the real translator) does the hard work. It understands your C semantics and figures out how to express them as CPU instructions.
Key insight: This is the stage you're learning about right now. Assembly is the interface between humans and the CPU.
Stage 3: Assembly
What it does: Converts assembly code into machine code (binary instructions the CPU understands).
Flag: gcc -c
Output: .o file (object file - binary, not human-readable)
gcc -c hello.c -o hello.o
# or from the .s file:
gcc -c hello.s -o hello.o
The assembler reads the human-readable assembly and outputs raw binary. But notice: you don't have an executable yet! Object files are "incomplete" - they contain references to external functions that need to be resolved.
Key insight: A .o file is object code, not executable. It's the "middle ground" between assembly and executables.
Stage 4: Linking
What it does: Combines object files + libraries into a single executable.
Flag: gcc (default, final executable)
Output: executable (or .exe on Windows)
gcc hello.c -o hello # All 4 stages in one command
gcc hello.o -o hello # Just linking (stage 4)
The linker (ld) takes your object files and the system libraries (like libc for printf) and glues them together. It resolves all the "TODO" references ("I need printf from somewhere!") by finding them in the libraries.
Key insight: Without linking, you have machine code islands floating around. The linker is the glue that holds it all together.
Full Pipeline Visualization
hello.c
↓ [gcc -E] Preprocessor (cpp)
hello.i (expanded C)
↓ [gcc -S] Compiler (cc1)
hello.s (assembly)
↓ [gcc -c] Assembler (as)
hello.o (object file, binary)
↓ [ld] Linker
hello (executable)
Common Compilation Flags
# Stop at each stage
gcc -E hello.c -o hello.i # Preprocess only
gcc -S hello.c -o hello.s # Compile to assembly, then stop
gcc -c hello.c -o hello.o # Stop after assembling
# Full compilation
gcc hello.c -o hello # Preprocess, compile, assemble, link
# Useful flags
gcc -g hello.c -o hello # Include debug symbols (for gdb)
gcc -O2 hello.c -o hello # Optimization level 2
gcc -Wall hello.c -o hello # Warn about all common mistakes
gcc -Wall -g -O2 hello.c -o hello # All of the above
Why should you care? When you see an error message like "undefined reference to printf", you know it's a linking error (stage 4). When you see a syntax error, it's a compilation error (stage 2). Understanding the pipeline helps you debug!
Make and Makefiles
What is Make?
Make is a build automation tool. Instead of typing the same gcc commands over and over, you write them once in a Makefile, and make executes them for you. Better yet, make only recompiles files that have changed - a huge time-saver for large projects.
Makefile Structure
A Makefile is a series of rules. Each rule specifies:
- A target (what you want to build)
- Dependencies (what files the target depends on)
- Commands (how to build it)
target: dependency1 dependency2 ...
	command_to_build_target
	another_command
CRITICAL: The indentation before commands MUST be a TAB character, not spaces! This is a common mistake.
Simple Example
# Variables (optional but recommended)
CC = gcc
CFLAGS = -Wall -g -O2
# Build the executable 'myprogram' from myprogram.c
myprogram: myprogram.c
	$(CC) $(CFLAGS) myprogram.c -o myprogram
# Delete all build artifacts
clean:
	rm -f myprogram
# .PHONY tells make these targets don't represent actual files
.PHONY: clean
You'd use it like:
make # Build myprogram (if it is missing or older than myprogram.c)
make clean # Delete myprogram
make myprogram # Build the named target explicitly (still skipped if it is up to date)
Multi-File Project Example
CC = gcc
CFLAGS = -Wall -g
# The main target depends on multiple object files
myprogram: main.o utils.o helpers.o
	$(CC) $(CFLAGS) main.o utils.o helpers.o -o myprogram
# Pattern rule: any .o file comes from a .c file with the same name
%.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@
# Clean up
clean:
	rm -f *.o myprogram
.PHONY: clean
Key variables:
- $< = first dependency
- $@ = target
- $(CC) = the compiler command
- $(CFLAGS) = compiler flags
Another Practical Example
# For the project with main.c, utils.c, and a utils.h header
CC = gcc
CFLAGS = -Wall -Wextra -g
LIBS = -lm # link math library
SOURCES = main.c utils.c
OBJECTS = $(SOURCES:.c=.o)
TARGET = myprogram
all: $(TARGET)
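$(TARGET): $(OBJECTS)
	$(CC) $(CFLAGS) $(OBJECTS) -o $(TARGET) $(LIBS)
%.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@
# Rebuild the objects when the header changes (assumes both sources include utils.h)
$(OBJECTS): utils.h
clean:
	rm -f $(OBJECTS) $(TARGET)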
.PHONY: all clean
Why Make Matters
Imagine you're working on a project with 50 source files. Without Make:
gcc -Wall -g file1.c file2.c file3.c ... file50.c -o myprogram
Every time you change one file, you wait for all 50 to recompile. With Make:
make
Make knows which files changed and only recompiles those + anything that depends on them. For a large project, this can mean the difference between a 2-minute wait and a 10-second rebuild.
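To make the "only recompile what changed" idea concrete, here is a minimal sketch of the timestamp comparison behind it. It is a simplified model, not Make's actual source: the function name needs_rebuild is hypothetical, timestamps are plain integers, and a target time of -1 is an assumed convention for "file does not exist".

```c
#include <stddef.h>

/*
 * Simplified model of Make's rebuild decision (illustrative sketch only).
 * target_mtime: last-modified time of the target, or -1 if it is missing.
 * dep_mtimes:   last-modified times of the target's dependencies.
 */
int needs_rebuild(long target_mtime, const long *dep_mtimes, size_t ndeps) {
    if (target_mtime < 0) {
        return 1;                       /* missing target: always build */
    }
    for (size_t i = 0; i < ndeps; i++) {
        if (dep_mtimes[i] > target_mtime) {
            return 1;                   /* a dependency is newer than the target */
        }
    }
    return 0;                           /* target is up to date */
}
```

With 50 object files, this check passes for the 49 whose sources are untouched, so only one compile step and the final link need to run.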
x86-64 Architecture Overview
Why Assembly?
Assembly is the human-readable form of machine code. CPUs understand only 1s and 0s, but we can write instructions like mov %rax, %rbx and an assembler converts them to binary.
x86-64 is the 64-bit version of the x86 instruction set, the dominant architecture on PCs and servers. Here's a quick history:
- 8086 (1978): 16-bit registers, started it all
- 80386 (1985): Extended to 32-bit (IA-32)
- x86-64 (2003): Extended to 64-bit, what we use today
Registers: Your Fastest Storage
A register is a tiny, super-fast piece of memory built into the CPU. x86-64 has 16 general-purpose 64-bit registers. Think of them as your CPU's "scratchpad" - variables you're actively using live here.
Each register can be accessed at different sizes:
- 64-bit: %rax, %rbx, etc.
- 32-bit: %eax, %ebx, etc. (lower 32 bits)
- 16-bit: %ax, %bx, etc. (lower 16 bits)
- 8-bit: %al, %bl, etc. (lowest 8 bits)
Golden rule: When you write to a 32-bit register (%eax), it zero-extends the upper 32 bits automatically. When you write to 8-bit or 16-bit, the upper bits are NOT cleared.
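The golden rule can be modeled in plain C. The sketch below treats %rax as a 64-bit integer and shows what each partial write leaves behind; the helper names write_eax, write_ax, and write_al are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Model of writing a 32-bit value to %eax: the upper 32 bits of %rax become 0. */
uint64_t write_eax(uint64_t rax, uint32_t value) {
    (void)rax;                          /* the old contents are discarded entirely */
    return (uint64_t)value;
}

/* Model of writing a 16-bit value to %ax: bits 16-63 of %rax are preserved. */
uint64_t write_ax(uint64_t rax, uint16_t value) {
    return (rax & ~UINT64_C(0xFFFF)) | value;
}

/* Model of writing an 8-bit value to %al: bits 8-63 of %rax are preserved. */
uint64_t write_al(uint64_t rax, uint8_t value) {
    return (rax & ~UINT64_C(0xFF)) | value;
}
```

For example, starting from 0x1122334455667788, write_al with 0xFF gives 0x11223344556677FF, while write_eax with any value wipes the top half.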
x86-64 Registers (CRITICAL)
You must memorize this table for the exam.
| 64-bit | 32-bit | 16-bit | 8-bit | Special Purpose |
|---|---|---|---|---|
| %rax | %eax | %ax | %al | Return value, accumulator |
| %rbx | %ebx | %bx | %bl | Callee-saved |
| %rcx | %ecx | %cx | %cl | 4th argument, loop counter |
| %rdx | %edx | %dx | %dl | 3rd argument, I/O |
| %rsi | %esi | %si | %sil | 2nd argument, source |
| %rdi | %edi | %di | %dil | 1st argument, destination |
| %rsp | %esp | %sp | %spl | Stack pointer |
| %rbp | %ebp | %bp | %bpl | Base pointer (frame) |
| %r8 | %r8d | %r8w | %r8b | 5th argument |
| %r9 | %r9d | %r9w | %r9b | 6th argument |
| %r10 | %r10d | %r10w | %r10b | Caller-saved |
| %r11 | %r11d | %r11w | %r11b | Caller-saved |
| %r12 | %r12d | %r12w | %r12b | Callee-saved |
| %r13 | %r13d | %r13w | %r13b | Callee-saved |
| %r14 | %r14d | %r14w | %r14b | Callee-saved |
| %r15 | %r15d | %r15w | %r15b | Callee-saved |
Register Naming Pattern
Look at %rax:
- %rax = 64-bit (quadword)
- %eax = 32-bit (doubleword) - the 'e' is for "extended"
- %ax = 16-bit (word)
- %al = 8-bit (byte) - the 'l' is for "low byte"
The r / e / no-prefix pattern holds for the first 8 registers, although the 8-bit names vary slightly (%al, %bl, %cl, %dl, but %sil, %dil, %spl, %bpl). For %r8–%r15, you add suffixes: b, w, d, or no suffix for 64-bit.
Critical Register Facts
- %rax is special: Function return values go here. Very common in reverse engineering.
- %rsp is sacred: This is the stack pointer. The stack grows downward in memory (toward lower addresses). Never mess with %rsp unless you know what you're doing.
- Caller-saved vs. Callee-saved: Some registers you must preserve if your function uses them (callee-saved: %rbx, %r12–%r15, %rsp, %rbp). Others the caller must save if it wants to keep their values (caller-saved: %rax, %rcx, %rdx, %rsi, %rdi, %r8–%r11).
- Arguments go in registers: The first 6 integer arguments to a function go in %rdi, %rsi, %rdx, %rcx, %r8, %r9 (in that order). The 7th argument and beyond go on the stack.
Data Sizes and Instruction Suffixes
In assembly, we use suffixes on instructions to specify data size. This is the x86-64 convention (AT&T syntax, what GCC uses).
| Suffix | Data Size | C Type | Example |
|---|---|---|---|
| b | 1 byte | char | movb $0x42, %al |
| w | 2 bytes | short | movw $0x1234, %ax |
| l | 4 bytes | int | movl $0xDEADBEEF, %eax |
| q | 8 bytes | long, pointer | movq $0x123456789, %rax |
Rule of thumb: Match the suffix to the register size you're using.
movb %al, %bl # Move 1 byte
movw %ax, %bx # Move 2 bytes
movl %eax, %ebx # Move 4 bytes (32-bit, zero-extends)
movq %rax, %rbx # Move 8 bytes (full 64-bit)
The mov Instruction
Syntax and Semantics
The mov instruction copies data from a source to a destination:
mov src, dst
This is AT&T syntax (what GCC outputs). Note: source comes first, destination comes second. This is opposite to Intel syntax (where you'd write mov dst, src). Be careful!
Semantics: The instruction reads from src and writes to dst. The original src is unchanged (it's a copy, not a move).
movl $42, %eax # %eax now contains 42
movl %eax, %ebx # %eax → %ebx, %eax unchanged
movl (%rax), %ebx # Read from memory at address %rax, store in %ebx
Operand Types
Each operand can be one of three types:
1. Immediate (Literal Value)
A constant value, written with a $ prefix.
movl $10, %eax # Move literal 10 into %eax
movl $0xFF, %ebx # Move literal 255 into %ebx
movq $0x1000, %rax # Move 64-bit address into %rax
Restriction: Only the source can be immediate (can't do mov %eax, $10).
2. Register
A register name.
movl %eax, %ebx # Copy %eax to %ebx
movq %r8, %rax # Copy %r8 to %rax
movw %cx, %dx # Copy %cx to %dx (16-bit)
3. Memory
An address in memory. Written with parentheses or offsets.
movl (%rax), %ebx # Load from address %rax into %ebx
movl $42, (%rax) # Store literal 42 to address %rax
movl 8(%rax), %ebx # Load from address (%rax + 8) into %ebx
Critical Restriction: No Memory-to-Memory
You cannot move directly from memory to memory. This is illegal:
movl (%rax), (%rbx) # WRONG! Illegal in x86-64
You must use a register as an intermediary:
movl (%rax), %ecx # Load from memory
movl %ecx, (%rbx) # Store to memory
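The same two-step shape appears whenever C code copies through pointers. In the sketch below, copy_int is a hypothetical helper; the register names in the comments assume the usual argument order (dst in %rdi, src in %rsi), and the exact instructions a compiler emits may differ.

```c
/* Copies one int from *src to *dst; the value must pass through a register. */
void copy_int(int *dst, const int *src) {
    int tmp = *src;   /* load:  roughly movl (%rsi), %eax */
    *dst = tmp;       /* store: roughly movl %eax, (%rdi) */
}
```

Writing *dst = *src; directly in C is fine; the compiler inserts the intermediate register for you.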
mov Operand Forms and Addressing Modes
The Addressing Modes
The CPU provides flexible ways to calculate memory addresses. Each mode is useful for different situations (array access, struct fields, pointer dereference, etc.).
| Mode | Syntax | Address | Meaning |
|---|---|---|---|
| Direct | addr | addr | Fixed address (rare) |
| Register Indirect | (%r) | R[r] | Address is in register r |
| Base + Displacement | D(%r) | R[r] + D | Register + constant offset |
| Indexed | (%r₁, %r₂) | R[r₁] + R[r₂] | Sum of two registers |
| Indexed with Offset | D(%r₁, %r₂) | R[r₁] + R[r₂] + D | Two regs + constant |
| Scaled Index | (%r, %i, s) | R[r] + s·R[i] | Base + scaled index |
| Full Form | D(%r, %i, s) | R[r] + s·R[i] + D | All combined |
General formula:
Effective Address = Displacement + BaseRegister + Scale × IndexRegister
Where:
- Displacement (D): A signed constant encoded in 1 or 4 bytes (so up to roughly ±2 billion, typically small). Negative values are common, as in -8(%rbp).
- Base register (Rb): Any general-purpose register
- Index register (Ri): Any general-purpose register EXCEPT %rsp
- Scale (S): 1, 2, 4, or 8 (multiplies the index)
Practical Examples
Array Access
int arr[10];
int x = arr[3]; // Access arr[3]
If arr is in %rdi and the index (3) is in %rsi:
movl (%rdi, %rsi, 4), %eax # Effective address: %rdi + 4*%rsi
# For index 3: %rdi + 4*3 = %rdi + 12
# (int is 4 bytes, so index 3 = byte offset 12)
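The scaled-index calculation can be written out by hand in C. This sketch assumes a non-negative index and uses a hypothetical helper name, load_elem; it computes base + sizeof(int) * i explicitly, which is 4 * i on typical x86-64 systems where int is 4 bytes.

```c
#include <string.h>

/* Loads arr[i] manually: base address + sizeof(int) * i, mirroring (%rdi, %rsi, 4). */
int load_elem(const int *arr, long i) {
    const unsigned char *base = (const unsigned char *)arr;
    int value;
    /* Assumes i >= 0; the byte offset is the index scaled by the element size. */
    memcpy(&value, base + sizeof(int) * (size_t)i, sizeof value);
    return value;
}
```

The result is the same as arr[i]; the C indexing operator hides exactly this arithmetic.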
Struct Field Access
struct Point { int x; int y; };
struct Point *p = ...;
int val = p->y; // Second field (offset 4 bytes from start)
If the struct pointer p is in %rax:
movl 4(%rax), %eax # Load from address %rax + 4 (the 'y' field)
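The displacement in a struct access is just the field's byte offset, which C exposes through the standard offsetof macro. The helper functions below are hypothetical names used only to make the offsets observable.

```c
#include <stddef.h>

struct Point { int x; int y; };

/* Byte offset of each field from the start of the struct. */
size_t offset_of_x(void) { return offsetof(struct Point, x); }
size_t offset_of_y(void) { return offsetof(struct Point, y); }

/* Reads p->y; on x86-64 this is typically one load at displacement 4. */
int get_y(const struct Point *p) { return p->y; }
```

On a system with 4-byte int, offset_of_x() is 0 and offset_of_y() is 4, matching the 4(%rax) operand.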
Pointer Dereference
int *ptr = ...;
int x = *ptr; // Dereference the pointer
movl (%rdi), %eax # Load from address stored in %rdi
The lea Instruction
What lea Does
lea stands for "Load Effective Address". It's different from mov in one crucial way:
lea DOES NOT read from or write to memory. It computes an address and stores the address in a register.
lea addr, %reg # Store the address in %reg (don't read from addr)
Compare:
movq (%rax), %rbx # Read value at address %rax, store in %rbx
leaq (%rax), %rbx # Compute address %rax, store in %rbx
The first reads memory; the second doesn't.
Why This Matters for the Exam
The exam loves lea because it's often used for clever arithmetic. Since lea computes addresses, you can use it to do math:
leaq (%rdi, %rdi, 2), %rax # %rax = %rdi + 2*%rdi = 3*%rdi
This computes 3*x in a single instruction! Without lea, you'd need multiple instructions:
# Without lea (3 instructions):
movq %rdi, %rax
addq %rdi, %rax # %rax = 2*%rdi
addq %rdi, %rax # %rax = 3*%rdi
Common Patterns
Compilers use lea for multiplications by small constants:
# 3*x in %rdi, result in %rax:
leaq (%rdi, %rdi, 2), %rax
# 5*x:
leaq (%rdi, %rdi, 4), %rax
# 9*x:
leaq (%rdi, %rdi, 8), %rax
# x*2 + 7 (address calc):
leaq 7(%rdi, %rdi, 1), %rax
When You See lea
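If you see lea in assembly, ask: "Is this computing an address for memory access, or doing arithmetic?" Both are common: compilers use lea to form real addresses (for example, a pointer to a local variable or an array element) and to do small multiply-and-add calculations. Either way, the instruction only computes the value and never accesses memory.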
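Going the other direction helps with recognition: each C function below has the shape base + index * scale (+ constant), so it fits a single lea. The function names are hypothetical, and the instructions in the comments are what an optimizing compiler would typically choose, not a guarantee.

```c
/* Each body has the form base + index * scale (+ displacement). */
long times3(long x)      { return x + x * 2; }      /* leaq (%rdi,%rdi,2), %rax  */
long times5(long x)      { return x + x * 4; }      /* leaq (%rdi,%rdi,4), %rax  */
long times9(long x)      { return x + x * 8; }      /* leaq (%rdi,%rdi,8), %rax  */
long twice_plus7(long x) { return x + x * 1 + 7; }  /* leaq 7(%rdi,%rdi,1), %rax */
```

Compiling these with gcc -O1 -S is a quick way to see which form the compiler actually picks.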
Arithmetic and Logical Instructions
Two-Operand Arithmetic Instructions
These instructions follow the pattern: op src, dst → dst = dst OP src
| Instruction | Operation | Syntax | Example | C Equivalent |
|---|---|---|---|---|
| add | Addition | add S, D | addl %ecx, %eax | eax += ecx |
| sub | Subtraction | sub S, D | subq $8, %rsp | rsp -= 8 |
| imul | Signed Multiply | imul S, D | imulq %rbx, %rax | rax *= rbx |
| and | Bitwise AND | and S, D | andl $0xFF, %eax | eax &= 0xFF |
| or | Bitwise OR | or S, D | orl %ecx, %eax | eax \|= ecx |
| xor | Bitwise XOR | xor S, D | xorl %eax, %eax | eax ^= eax |
| sal / shl | Left Shift | sal k, D | sall $2, %eax | eax <<= 2 |
| sar | Arithmetic Right Shift | sar k, D | sarl $1, %eax | eax >>= 1 (signed) |
| shr | Logical Right Shift | shr k, D | shrl $1, %eax | eax >>= 1 (unsigned) |
Note: For shifts, the amount can be:
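- The %cl register (the only register allowed as a shift count)
- An 8-bit immediate (the CPU uses only the low 5 bits of the count, or the low 6 bits for 64-bit operands)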
One-Operand Instructions
| Instruction | Operation | Syntax | Example |
|---|---|---|---|
| inc | Increment | inc D | incl %eax |
| dec | Decrement | dec D | decl %eax |
| neg | Negate (two's complement) | neg D | negl %eax |
| not | Bitwise NOT | not D | notl %eax |
Clever Tricks
Zero a register using xor:
xorl %eax, %eax # XOR with itself = 0; shorter encoding than movl $0, %eax
This is idiomatic x86 assembly. Any time you see xorl %reg, %reg, it means "zero this register."
Multiply by powers of 2 using shifts:
sall $2, %eax # Left shift by 2 = multiply by 4 (2^2)
shll $3, %eax # Left shift by 3 = multiply by 8 (2^3)
Shifts are faster than multiplication.
Divide by 2 using arithmetic right shift:
sarl $1, %eax # Arithmetic right shift by 1 = signed divide by 2, rounding toward negative infinity (-7 becomes -4, not -3)
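The shift and the C division operator agree for non-negative values but round differently for negative odd values. The sketch below uses hypothetical helper names; note that right-shifting a negative signed value is implementation-defined in ISO C, and the comments assume GCC's documented behavior (an arithmetic shift).

```c
/* Signed divide by 2: C division truncates toward zero. */
int div2(int x) { return x / 2; }

/*
 * Right shift by 1. For negative x the C standard leaves the result
 * implementation-defined; GCC treats it as an arithmetic shift (sarl),
 * which rounds toward negative infinity.
 */
int sar1(int x) { return x >> 1; }
```

This difference is why compilers add a small correction before the shift when they turn a signed x / 2 into shift instructions.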
How Suffixes Work with Arithmetic
Just like mov, arithmetic instructions have size suffixes:
addl %ecx, %eax # Add 32-bit values
addq %rcx, %rax # Add 64-bit values
addw %cx, %ax # Add 16-bit values
addb %cl, %al # Add 8-bit values
Reverse Engineering Assembly
This is the key exam skill. You'll be given assembly and asked: "What C code generated this?" Let's practice.
Strategy for Reverse Engineering
- Identify the function: Look for the function label and the ret at the end.
- Track arguments: Where do %rdi, %rsi, %rdx go? These are the first three arguments.
- Watch return value: What ends up in %rax before ret?
- Trace operations: Follow each instruction to understand what the code does.
- Map back to C: Reconstruct the C code that would produce this assembly.
Example 1: Simple Arithmetic
Assembly:
simple:
leal (%rdi, %rdi, 2), %eax # %eax = 3 * %edi
addl %esi, %eax # %eax += %esi
ret
Step-by-step:
- First argument is in %rdi (call it x)
- Second argument is in %rsi (call it y)
- leal (%rdi, %rdi, 2), %eax: Compute %rdi + 2*%rdi = 3*x, store in %eax
- addl %esi, %eax: Add second argument: %eax += y
- ret: Return %eax
C code:
int simple(int x, int y) {
return 3 * x + y;
}
Example 2: Array Access
Assembly:
get_array_elem:
movl (%rdi, %rsi, 4), %eax # Load arr[index]
ret
Step-by-step:
- First argument is in %rdi (array pointer, call it arr)
- Second argument is in %rsi (index, call it i)
- movl (%rdi, %rsi, 4), %eax: Load from address %rdi + 4*%rsi into %eax
  - This is array indexing! Effective address = base + scale*index
  - Scale is 4 because we're dealing with int (4 bytes)
  - The full 64-bit %rsi is used as the index, so i is a 64-bit integer (long); with an int index the compiler would first sign-extend %esi (movslq)
- ret: Return %eax
C code:
int get_array_elem(int arr[], long i) {
    return arr[i]; // or: *(arr + i)
}
Example 3: Conditional Arithmetic
Assembly:
conditional_op:
movl %edi, %eax # %eax = x
cmpl %esi, %edi # Compare x with y
jle .L1 # Jump if x <= y
imull %esi, %eax # %eax *= y (if x > y)
jmp .L2
.L1:
addl %esi, %eax # %eax += y (if x <= y)
.L2:
ret
Step-by-step:
- First arg in %rdi (x), second in %rsi (y)
- Copy x into %eax (default return value)
- Compare x with y
- If x <= y, jump to .L1 (add path)
- Otherwise, multiply by y and jump to .L2
- At .L1, add y instead
- Return %eax
C code:
int conditional_op(int x, int y) {
if (x > y) {
return x * y;
} else {
return x + y;
}
}
Calling Conventions (Brief)
Understanding how functions call other functions is crucial for reading assembly. Here are the essentials:
Argument Passing (x86-64 System V ABI)
Integer arguments (the first 6):
- 1st argument: %rdi
- 2nd argument: %rsi
- 3rd argument: %rdx
- 4th argument: %rcx
- 5th argument: %r8
- 6th argument: %r9
Arguments 7+ go on the stack.
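As a reading aid, the mapping can be annotated on an ordinary C function. The function sum7 is a hypothetical example; the register assignments in the comment follow the System V convention listed above and apply to integer and pointer parameters only.

```c
/*
 * Under the System V x86-64 calling convention, the integer parameters map to:
 *   a -> %rdi, b -> %rsi, c -> %rdx, d -> %rcx, e -> %r8, f -> %r9,
 *   g -> passed on the stack (it is the 7th argument).
 */
long sum7(long a, long b, long c, long d, long e, long f, long g) {
    return a + b + c + d + e + f + g;   /* the result is returned in %rax */
}
```

In the assembly for such a function, the one stack operand, something like 8(%rsp), is the clue that a seventh argument exists.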
Return Value
- Integer return value: In %rax (a 128-bit integer result is split across %rdx:%rax, with the high half in %rdx)
- Floating-point return value: In %xmm0
Register Preservation
Caller must save these if they need to preserve them after a call:
- %rax, %rcx, %rdx, %rsi, %rdi, %r8–%r11 (caller-saved)
Functions must preserve these:
- %rbx, %r12–%r15, %rbp, %rsp (callee-saved)
This means if your function uses %rbx, you must save it on entry and restore it on exit.
Practice Problems
Problem 1: GCC Pipeline
Question: Given a file myprogram.c, write the exact GCC commands to:
- Produce the preprocessed file (myprogram.i)
- Produce the assembly file (myprogram.s)
- Produce the object file (myprogram.o)
- Produce the final executable with debug symbols
Answer:
# 1. Preprocess
gcc -E myprogram.c -o myprogram.i
# 2. Compile to assembly
gcc -S myprogram.c -o myprogram.s
# 3. Assemble to object
gcc -c myprogram.c -o myprogram.o
# 4. Full compilation with debug
gcc -g myprogram.c -o myprogram
Problem 2: Makefile
Question: Write a Makefile for a project with main.c, utils.c, and utils.h that compiles to myprogram. Include a clean target.
Answer:
CC = gcc
CFLAGS = -Wall -g
SOURCES = main.c utils.c
OBJECTS = $(SOURCES:.c=.o)
TARGET = myprogram
all: $(TARGET)
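$(TARGET): $(OBJECTS)
	$(CC) $(CFLAGS) $(OBJECTS) -o $(TARGET)
%.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@
# Rebuild the objects when the header changes (assumes both sources include utils.h)
$(OBJECTS): utils.h
clean:
	rm -f $(OBJECTS) $(TARGET)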
.PHONY: all clean
Problem 3: Addressing Modes
Question: Evaluate the following addressing mode expressions. Assume %rax = 0x1000, %rcx = 0x10, %rdx = 0x2.
- (%rax)
- 8(%rax)
- (%rax, %rcx)
- 8(%rax, %rcx)
- (%rax, %rdx, 4)
- 16(%rax, %rcx, 2)
Answers:
- (%rax) → 0x1000
- 8(%rax) → 0x1008
- (%rax, %rcx) → 0x1010 (0x1000 + 0x10)
- 8(%rax, %rcx) → 0x1018 (0x1000 + 0x10 + 8)
- (%rax, %rdx, 4) → 0x1008 (0x1000 + 4*2)
- 16(%rax, %rcx, 2) → 0x1030 (0x1000 + 2*0x10 + 16)
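These answers can be checked mechanically with the general formula. The function below is a minimal sketch with a hypothetical name, effective_address; it does not validate that the scale is 1, 2, 4, or 8, and a missing base, index, or displacement is simply passed as 0 (with scale 1).

```c
#include <stdint.h>

/* Effective address = displacement + base + scale * index. */
uint64_t effective_address(int64_t disp, uint64_t base, uint64_t index, unsigned scale) {
    /* Unsigned arithmetic wraps modulo 2^64, so a negative displacement also works. */
    return base + (uint64_t)scale * index + (uint64_t)disp;
}
```

For instance, 16(%rax, %rcx, 2) corresponds to effective_address(16, 0x1000, 0x10, 2).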
Problem 4: Reverse Engineering
Question: What C function generated this assembly?
mystery:
xorl %eax, %eax # Zero %eax
xorl %ecx, %ecx # Zero %ecx
jmp .L2
.L1:
addl (%rdi, %rcx, 4), %eax # %eax += arr[i]
incl %ecx # i++
.L2:
cmpl %esi, %ecx # Compare i with n
jl .L1 # If i < n, loop
ret
Step-by-step:
- Zero %eax (sum accumulator) and %ecx (loop counter i)
- Jump to condition check
- Loop body: add arr[i] to sum, increment i
- Check if i < n; if so, loop
- Return sum in %eax
C code:
int mystery(int arr[], int n) {
int sum = 0;
for (int i = 0; i < n; i++) {
sum += arr[i];
}
return sum;
}
(This is a simple array sum!)
Common Exam Traps
1. AT&T vs. Intel Syntax
AT&T (GCC default): mov src, dst
movl %eax, %ebx # Copy %eax to %ebx
Intel syntax: mov dst, src (opposite!)
mov ebx, eax # Copy eax to ebx
Trap: Mentally flipping syntax can cause you to misread assembly. Always remember: GCC uses AT&T, source first.
2. Makefile Tab Indentation
WRONG:
myprogram: myprogram.c
    gcc myprogram.c -o myprogram # Spaces!
RIGHT:
myprogram: myprogram.c
	gcc myprogram.c -o myprogram # TAB!
Most text editors can be configured to use tabs for Makefiles. If you copy from a web example, be careful.
3. 32-bit Register Zero-Extension
Writing to a 32-bit register automatically zero-extends the upper 32 bits:
movl $0xFFFFFFFF, %eax
# %rax is now 0x00000000FFFFFFFF, not 0xFFFFFFFFFFFFFFFF
Why? This is a design choice in x86-64: because the upper bits are always cleared, a 32-bit write never depends on the register's old upper half.
But writing to 8-bit or 16-bit does NOT zero-extend:
movb $0xFF, %al
# %rax upper bits are UNCHANGED (not cleared)
4. lea Does NOT Access Memory
leaq (%rax), %rbx
# This stores the ADDRESS in %rbx, does not read from it
Compare:
movq (%rax), %rbx
# This reads the VALUE at address %rax, stores in %rbx
Trap: If you see lea, don't assume memory access happened.
5. xorl as Idiomatic Zero
xorl %eax, %eax # Sets %eax to 0
This is everywhere in real assembly. Recognize it instantly as "zero this register."
6. Stack Grows Downward
subq $8, %rsp # Allocate 8 bytes on stack
addq $8, %rsp # Deallocate 8 bytes from stack
The stack pointer %rsp points to the top (smallest address) of the stack. When you allocate space, you subtract from %rsp. When you deallocate, you add back.
Trap: Students often write it backwards.
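A toy model makes the direction easy to remember. The sketch below treats %rsp as a plain number and touches no real memory; alloc_bytes and free_bytes are hypothetical helper names.

```c
#include <stdint.h>

/* Toy model of the stack pointer (illustrative; no real memory is touched). */
uint64_t alloc_bytes(uint64_t rsp, uint64_t nbytes) {
    return rsp - nbytes;    /* like subq $nbytes, %rsp: the stack grows down */
}

uint64_t free_bytes(uint64_t rsp, uint64_t nbytes) {
    return rsp + nbytes;    /* like addq $nbytes, %rsp: the space is given back */
}
```

Starting from 0x7000, allocating 8 bytes moves the pointer to 0x6FF8, and freeing them brings it back to 0x7000.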
7. Instruction Suffixes Must Match Operand Size
movl %eax, %ebx # 32-bit: correct
movl %rax, %rbx # Wrong! %rax is 64-bit, should use movq
movq $42, %eax # Wrong! The q suffix means 64-bit, but %eax is a 32-bit register
8. sal and shl Are the Same
sall $2, %eax
shll $2, %eax
# These are identical (logical left shift = arithmetic left shift)
But sar (arithmetic right shift) and shr (logical right shift) differ:
sarl $1, %eax # Arithmetic right shift (sign-extends)
shrl $1, %eax # Logical right shift (zero-extends)
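When the sign bit is 0 (any non-negative value), they give the same result. They differ only when the sign bit is 1: compilers use shr for unsigned types and sar for signed types.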
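In C, the choice between the two right shifts follows from the operand's type. The sketch below uses hypothetical helper names; the signed case relies on GCC's behavior, since shifting a negative value right is implementation-defined in ISO C.

```c
#include <stdint.h>

/* Unsigned operand: the compiler uses a logical shift (shrl), filling with 0. */
uint32_t shift_unsigned(uint32_t x) { return x >> 1; }

/*
 * Signed operand: GCC uses an arithmetic shift (sarl), copying the sign bit.
 * (Shifting a negative value right is implementation-defined in ISO C.)
 */
int32_t shift_signed(int32_t x) { return x >> 1; }
```

The bit pattern 0x80000000 becomes 0x40000000 under the logical shift, but 0xC0000000 (that is, -1073741824) under the arithmetic shift.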
9. Arguments and Calling Convention
Don't assume the first argument is in %rax. It's in %rdi! (%rax holds the return value.)
myfunction:
movl %edi, %eax # First argument is in %edi, copy to %eax
ret
Get the argument mapping wrong, and your reverse engineering will be completely wrong.
10. Memory-to-Memory is Illegal
movl (%rax), (%rbx) # WRONG! Illegal instruction
You must use a register:
movl (%rax), %ecx
movl %ecx, (%rbx)
Summary: Key Takeaways
- GCC Pipeline: Preprocess → Compile → Assemble → Link
- Make: Automates builds; use Makefiles with TAB indentation
- Registers: 16 general-purpose 64-bit registers; memorize their names and purposes
- Data sizes: b=byte, w=word, l=long(4), q=quad(8); match suffixes to register sizes
- mov: Copies data; source first (AT&T); can't do memory-to-memory
- Addressing modes: Direct, indirect, base+displacement, indexed, scaled
- lea: Computes addresses without reading memory; useful for arithmetic
- Arithmetic: add, sub, imul, and, or, xor, shifts; follow src,dst pattern
- Reverse engineering: Track arguments, return value, and operations carefully
- Calling convention: First 6 args in %rdi, %rsi, %rdx, %rcx, %r8, %r9; return in %rax
Quick Reference: Common Instructions
# Data Movement
movb, movw, movl, movq # Move (byte, word, long, quad)
leaq # Load effective address (no memory access!)
# Arithmetic
addl, addq # Add
subl, subq # Subtract
imull, imulq # Signed multiply
incl, incq # Increment
decl, decq # Decrement
negl, negq # Negate
# Logical
andl, andq # Bitwise AND
orl, orq # Bitwise OR
xorl, xorq # Bitwise XOR
notl, notq # Bitwise NOT
# Shifts
sall, salq # Left shift (same as shl)
shll, shlq
sarl, sarq # Arithmetic right shift (sign-extends)
shrl, shrq # Logical right shift (zero-extends)
# Control Flow (not covered in detail here, but you'll see them)
cmp # Compare
jl, jle, jg, jge, je, jne # Conditional jumps
jmp # Unconditional jump
ret # Return from function
Final Exam Advice
- Memorize the register table. You'll refer to it constantly when reading assembly.
- Practice reverse engineering. Do at least 10 examples before the exam. It's a skill that takes practice.
- Watch for idioms. Assembly code has patterns: xorl reg, reg for zero, leaq for multiplication, etc.
- Trace execution step-by-step. Don't try to understand assembly by glancing at it. Trace each instruction carefully.
- Test your Makefiles. Make one mistake (spaces instead of tab) and your Makefile won't work. Test before the exam.
- Understand the pipeline; don't just memorize it. Know why there are 4 stages and what each does.
Good luck on the exam!