You’ve mastered “Hello, Assembly!” and peeked inside registers. Now let’s plunge into pointers, arrays, function calls, iteration, and memory regions. You’ll learn how ARM64 turns high-level constructs into pointer arithmetic, branches, and stack frames, and why straying outside the memory mapped for your process ends in a fault.
Pointers & Arrays: Address, Address, Address
In C, indexing an array is just pointer arithmetic under the hood:
int nums[] = {10, 20, 30};
printf("%d\n", nums[1]);
In ARM64 assembly, you load the base address of nums with a two-instruction sequence, then load the element at its byte offset:
adrp x0, nums@PAGE // load page base of nums
add x0, x0, nums@PAGEOFF // add the low-12-bit offset → &nums[0]
ldr w1, [x0, #4] // load nums[1]
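To make the #4 in that load less magical, here is a minimal C sketch of the same address math. The helper names byte_offset and load_at are hypothetical, and the sketch assumes a 4-byte int, which holds on arm64 Apple platforms.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte offset of element i in an int array.
   This is the immediate that ends up in `ldr w1, [x0, #offset]`. */
size_t byte_offset(size_t i) {
    return i * sizeof(int);
}

/* Hypothetical helper: read element i by doing the address math by hand,
   the way the assembly does: base address + byte offset, then a load. */
int load_at(const int *base, size_t i) {
    const unsigned char *addr = (const unsigned char *)base + byte_offset(i);
    int value;
    memcpy(&value, addr, sizeof value);  /* plays the role of the ldr */
    return value;
}
```

With int nums[] = {10, 20, 30}, load_at(nums, 1) reads the same bytes as nums[1] and *(nums + 1).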
How ADRP/ADD Works
ARM64 splits a full 64-bit address into:
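full 64-bit address (0x1234_5678_9ABC_DEF0)
────────────────────────────────────────────────
| PAGE (top 52 bits)   │ OFFSET (low 12 bits) │
| 0x1234_5678_9ABC_D   │ 0xEF0                │
────────────────────────────────────────────────
- ADRP loads the page base: the symbol's address with its low 12 bits zeroed (here 0x1234_5678_9ABC_D000), computed relative to the page of the current instruction.
- ADD …@PAGEOFF tacks on the low-12-bit offset (here 0xEF0).
- One ADRP page = 4 KB = 2¹² bytes, so 12 bits cover every byte inside. ADRP always works in 4 KB units, even where the OS uses larger MMU pages.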
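The split is plain bit masking. A minimal sketch in C, where the macro and function names are made up for illustration:

```c
#include <stdint.h>

#define ADRP_PAGE_BITS   12u                                     /* 4 KB = 2^12 bytes */
#define ADRP_OFFSET_MASK ((UINT64_C(1) << ADRP_PAGE_BITS) - 1)   /* 0xFFF */

/* What ADRP leaves in the register: the address with its low 12 bits cleared. */
uint64_t page_base(uint64_t addr) {
    return addr & ~ADRP_OFFSET_MASK;
}

/* What @PAGEOFF resolves to: just the low 12 bits. */
uint64_t page_offset(uint64_t addr) {
    return addr & ADRP_OFFSET_MASK;
}
```

For the address 0x1234_5678_9ABC_DEF0, page_base gives 0x1234_5678_9ABC_D000 and page_offset gives 0xEF0; adding the two reconstructs the original address, which is exactly what the ADD does.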
Visualizing a Page
Page boundary ───────────────────────────────────────────────
┌───────────────────────────────────┐
│ PAGE base = 0x1234_5678_9ABC_D000 │ ← adrp x0, nums@PAGE
│                                   │
│ [0xEF0] nums[0] → offset 0xEF0    │ ← add x0, x0, nums@PAGEOFF
│ [0xEF4] nums[1] → offset 0xEF4    │
│ [0xEF8] nums[2] → offset 0xEF8    │
└───────────────────────────────────┘
↑ lower addresses ↑ higher
Multi-Page Buffers
If your data spans >1 page (e.g., a 6 KB JPEG buffer), two pages hold it:
Buffer: 0x..._9ABC_D000 – 0x..._9ABC_E7FF (6 KB)
──────────────────────────────────────────────────────────
PAGE 1: 0x..._9ABC_D [offsets 0x000–0xFFF] (4 KB)
PAGE 2: 0x..._9ABC_E [offsets 0x000–0x7FF] (2 KB)
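Your loop doesn’t need special page-crossing code in user mode: as long as both pages belong to your allocation, the MMU translates each virtual address transparently. If you stray outside the memory mapped for your process (stack, heap, or data segment), the access hits an unmapped or protected page and the CPU raises a fault (a segfault), stopping you in your tracks. A small overrun that lands in other mapped memory does not fault; it silently corrupts whatever lives there.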
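If you want to check how many 4 KB pages a buffer touches, the arithmetic is short. A sketch in C, with a hypothetical function name and a hard-coded 4 KB page size as an assumption:

```c
#include <stdint.h>

#define PAGE_BYTES UINT64_C(4096)   /* assumed page size for this sketch */

/* Number of pages touched by the byte range [start, start + len). */
uint64_t pages_spanned(uint64_t start, uint64_t len) {
    if (len == 0)
        return 0;
    uint64_t first = start / PAGE_BYTES;              /* page of the first byte */
    uint64_t last  = (start + len - 1) / PAGE_BYTES;  /* page of the last byte  */
    return last - first + 1;
}
```

A 6 KB buffer that starts on a page boundary spans two pages, while the 12-byte nums array fits in one. Note that even an 8-byte object can straddle two pages if it starts 4 bytes before a boundary.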
Iteration & Branching: Comparison Operators
Loops in assembly boil down to compare + branch:
- cmp rn, rm compares rn – rm (sets flags, discards the result).
- subs rd, rn, #imm subtracts an immediate, sets flags (and writes the result to rd).
- Branches:
  - b.eq / b.ne → equal / not equal
  - b.lt / b.ge → signed less-than / signed greater-or-equal
  - b.lo / b.hs → unsigned less-than / unsigned greater-or-equal
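The signed/unsigned split matters because the same bits can order differently. In C the choice is made by the type; in assembly you make it by picking the branch. A small sketch with hypothetical function names:

```c
#include <stdbool.h>
#include <stdint.h>

/* cmp + b.lt: both operands are read as signed two's-complement values. */
bool less_signed(int32_t a, int32_t b) {
    return a < b;
}

/* cmp + b.lo: the very same bit patterns are read as unsigned values. */
bool less_unsigned(uint32_t a, uint32_t b) {
    return a < b;
}
```

The bit pattern 0xFFFFFFFF is -1 when signed, so less_signed(-1, 1) is true, but it is 4294967295 when unsigned, so less_unsigned(0xFFFFFFFF, 1) is false. Picking b.lt where you meant b.lo is a classic source of bounds-check bugs.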
Example: walking an array
adrp x0, nums@PAGE
add x0, x0, nums@PAGEOFF
mov x2, #3 // count = 3
loop:
ldr w3, [x0], #4 // load then x0 += 4
// … process w3 …
subs x2, x2, #1 // x2 = x2 – 1
b.ne loop // if x2 ≠ 0, jump to loop
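Read as C, that loop is a do/while with a post-incremented pointer. In the sketch below, the function name walk and the copy into out are hypothetical stand-ins for "process w3":

```c
#include <stdint.h>

/* Precondition: count >= 1. The test sits at the bottom of the loop,
   just like subs / b.ne in the assembly, so the body always runs once. */
void walk(const int32_t *p, uint64_t count, int32_t *out) {
    do {
        int32_t v = *p++;   /* ldr w3, [x0], #4 : load, then advance 4 bytes */
        *out++ = v;         /* ... process w3 ... (here: copy it out)        */
        count -= 1;         /* subs x2, x2, #1                               */
    } while (count != 0);   /* b.ne loop                                     */
}
```

The precondition is the thing to notice: with a count of 0, the assembly version would decrement past zero and keep walking far beyond the array.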
Functions & Calling Conventions
High-level calls become a choreographed dance:
- Prologue: set up a stack frame
- Body: use x0–x7 for arguments and x0 for the return value; save any callee-saved registers (x19–x28) you modify
- Epilogue: tear down the frame and ret
.global _sum
_sum: // ← label: a branch target, like loop: above
stp x29, x30, [sp, #-16]! // push FP (x29) & LR (x30)
mov x29, sp // new frame pointer
// x0 = arr ptr, x1 = count
mov w2, #0 // total = 0
mov w3, #0 // i = 0
loop_sum:
cmp w3, w1 // compare i and count
b.ge done_sum // if i ≥ count, break
ldr w4, [x0, w3, sxtw #2] // w4 = arr[i] (a W-register index needs sxtw/uxtw; lsl requires an X register)
add w2, w2, w4 // total += w4
add w3, w3, #1 // i++
b loop_sum // goto loop_sum
done_sum:
mov w0, w2 // return total in w0
ldp x29, x30, [sp], #16 // restore FP & LR, pop stack
ret // jump to address in LR
- Labels like _sum:, loop_sum:, and done_sum: mark code positions for branches. There is no direct analog in Swift/ObjC; think of them as named “goto” anchors under the hood.
- Storing/restoring x29 (frame pointer) and x30 (link register) preserves the caller's frame and return address, so the function gets a private workspace and is free to call other functions.
- Return uses ret, which branches to the address in x30.
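For comparison, this is roughly the C that the routine corresponds to. The name sum_ref is hypothetical, and the comments map each statement back to the instructions above:

```c
#include <stdint.h>

int32_t sum_ref(const int32_t *arr, int32_t count) {
    int32_t total = 0;                     /* mov w2, #0                       */
    for (int32_t i = 0; i < count; i++) {  /* cmp w3, w1 / b.ge done_sum       */
        total += arr[i];                   /* ldr w4, [x0, ...] / add w2, w2, w4 */
    }
    return total;                          /* mov w0, w2                       */
}
```

Because the comparison happens at the top of the loop, a count of zero (or a negative one, since b.ge is a signed test) returns 0 without touching the array.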
Heap vs. Stack: Memory Management
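- Stack: auto-managed LIFO region addressed via sp. Fast, but limited in size and prone to overflow if you reserve too much. On ARM64, sp must stay 16-byte aligned whenever it is used to access memory.
- Heap: managed by malloc/free, which request memory from the kernel via system calls as needed. Traditionally pictured as growing upward; allocations live until you free them, and heavy churn can fragment it.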
// Reserve 32 bytes on stack
sub sp, sp, #32
str x0, [sp, #0] // store a local var
// … use locals …
add sp, sp, #32 // release stack
// Call malloc(64)
mov x0, #64
bl _malloc // link-branch to libc’s malloc
// x0 = pointer to heap memory
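The same two allocations in C look like this. It is a sketch with hypothetical function names; a real compiler may keep the local in a register instead of on the stack, and the 0xAB fill is only there to show the memory is usable.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stack: 32 bytes of locals that vanish when the function returns.
   The compiler emits the sub/add on sp for us. */
uint64_t stack_demo(uint64_t value) {
    uint64_t locals[4];          /* 4 x 8 = 32 bytes       */
    locals[0] = value;           /* str x0, [sp, #0]       */
    return locals[0];
}

/* Heap: malloc(64) returns memory that stays valid until free().
   Returns NULL on failure; the caller owns the result. */
uint8_t *heap_demo(void) {
    uint8_t *buf = malloc(64);   /* mov x0, #64 / bl _malloc */
    if (buf != NULL)
        memset(buf, 0xAB, 64);
    return buf;
}
```

The key difference is lifetime: returning a pointer to locals would be a bug, while the pointer from heap_demo stays valid until the caller passes it to free.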
Putting it all together
Below are the two assembly files that compute the sum of 10, 20, and 30 and print “Sum = 60” (60 is 0x3c in hex) every time.
sum.s
This is the sum function. It expects two parameters:
- pointer to an Int array (64-bit elements) in x0
- count of elements in w1
It loops over the values and leaves the result in x0.
// sum.s
.section __TEXT,__text
.global _sum
_sum:
// Prologue: save FP & LR
stp x29, x30, [sp,#-16]!
mov x29, sp
mov x2, #0 // total = 0
loop:
cbz w1, done // if count==0, exit
ldr x3, [x0],#8 // load *ptr → x3; ptr += 8
add x2, x2, x3 // total += x3
subs w1, w1,#1 // count--, set flags
b.ne loop // if count!=0, repeat
done:
mov x0, x2 // return total in x0
// Epilogue: restore FP & LR
ldp x29, x30, [sp],#16
ret
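If you want something to check the assembly against, here is a C reference with the same shape: 64-bit elements, a 32-bit count, and a pointer bumped by 8 bytes per element. The name sum64_ref is hypothetical.

```c
#include <stdint.h>

int64_t sum64_ref(const int64_t *ptr, uint32_t count) {
    int64_t total = 0;        /* mov x2, #0                       */
    while (count != 0) {      /* cbz w1, done                     */
        total += *ptr++;      /* ldr x3, [x0], #8 / add x2, x2, x3 */
        count--;              /* subs w1, w1, #1                  */
    }
    return total;             /* mov x0, x2                       */
}
```

Unlike a bare subs/b.ne countdown, the cbz guard makes a count of zero safe: the function returns 0 without reading memory.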
main.s
The main file prepares the array, puts the arguments in the proper registers, calls our function, and prints the result. You can freely change the array and count values to experiment!
.section __TEXT,__text
.globl _main
.extern _sum
.extern _printf
.align 2
_main:
// — Prologue: save FP & LR, allocate 16 bytes
stp x29, x30, [sp, #-16]!
mov x29, sp
sub sp, sp, #16
// — Compute sum(arr,3) into x0 —
adrp x0, arr@PAGE
add x0, x0, arr@PAGEOFF
mov w1, #3
bl _sum // returns sum in x0
// — Store sum on stack for printf %d (Apple arm64 passes variadic args on the stack) —
mov x8, x0 // move full 64-bit sum into x8
str x8, [sp] // write sum at [sp]
// — Print it in decimal —
adrp x0, Lstr@PAGE
add x0, x0, Lstr@PAGEOFF
bl _printf
// — Teardown & return 0 —
add sp, sp, #16
ldp x29, x30, [sp], #16
mov w0, #0
ret
// — Data: format string then array —
.section __TEXT,__cstring
.align 2
Lstr:
.asciz "Sum = %d\n"
.section __DATA,__data // arr is raw data, not a C string, so keep it out of __cstring
.align 3 // ensure arr is 8-byte aligned
arr:
.quad 10, 20, 30 // 3×64-bit integers
Extras
If you are using Xcode, as in the first post, try the command line instead! Make sure you have clang installed.
clang -arch arm64 main.s sum.s -o sum_demo
./sum_demo
Why? Xcode does not provide clear errors for assembler code. If you have an assembler error, it will just say that the exit code was not 0.
The CLI reports the line number and the type of error, which is very good for debugging!
Xcode can still step through assembly instruction by instruction, and lldb can print registers!
Try this next time:
register read x1
Happy inspecting!
Closing Thoughts
You now understand:
- How ADRP/ADD constructs full addresses (and why 4 KB pages matter).
- How to iterate with cmp/subs + b.* branches.
- How stack frames and labels underpin every function call.
- That touching memory outside anything mapped for your process is caught by the MMU and results in a fault.
- How to build from the command line and inspect registers with lldb.
In our final part, we’ll wield bitwise ops to filter JPEG data and sprinkle inline assembly into a real iOS example. Until then, write small tests: walk a buffer byte-by-byte, implement your own strlen, and debug with the Xcode disassembler open. Happy hacking!