Relax and Unwind
Part 2: Tracing the Stack
— 5 minLast time, we were looking at how calls are actually implemented in native assembly code, and what kind of instructions the CPU is executing and what registers are involved.
We do know that the rsp
register (the stack pointer) points to the top of
the stack, and we learned that the call
instruction pushes the next
instruction onto the stack, which then becomes the bottom of the new stack
frame, and the ret
instruction jumps to whatever instruction address is on
top of the stack. In between though, the stack pointer can move as the function
will push and pop things onto and from the stack. So the position of the
return address relative to the current stack pointer will change. Most of the
times statically, but sometimes even dynamically, but I will come back to that
in a later part of this series.
Anyhow, it seems we are a bit stuck. We only know where the current stack frame ends (the stack pointer), we don’t really know where it began (where the return address is).
So lets go back a bit and look at the special purpose registers again.
One of them is called rbp
, which stands for base pointer. It turns out that
in older times, this base pointer served exactly this purpose. Lets look at how
it is used by passing the -Cforce-frame-pointers=yes
option to our Rust
compiler. This is the output that the Compiler Explorer
gives me:
example::never:
push rbp
mov rbp, rsp
call example::gonna
pop rbp
ret
example::gonna:
push rbp
mov rbp, rsp
call example::give
pop rbp
ret
example::give:
push rbp
mov rbp, rsp
pop rbp
ret
I will also try to visualize how the stack actually looks like in that case.
Also note that I drew the stack top to bottom, as it actually grows high to low.
Anyhow, what we see is that rbp
points to the position in the stack that has
the parents base pointer. And exactly below that (actually above when
talking of addresses) is the return address.
┌────────────────────────┐
│ return addr of parent │
↱│ bp of parent │
│├────────────────────────┤
││ return addr `never` │
└│ bp of `never` │↰
├────────────────────────┤│
│ return addr `gonna` ││
rbp → │ bp of `gonna` │┘
│ ... locals of `give` │
rsp → │ ... │
└────────────────────────┘
# Walking the Stack
Alright, lets try to actually walk that stack of a live program.
First, we need to actually read the rsp
, rbp
(and rip
) registers. For
this I will play a bit with unsafe Rust features, namely the nightly-only asm
feature.
let mut sp: usize;
let mut bp: usize;
let mut ip: usize;
unsafe {
asm!(
"mov {sp}, rsp",
"mov {bp}, rbp",
"lea {ip}, [rip]",
sp = out(reg) sp,
bp = out(reg) bp,
ip = out(reg) ip,
options(nomem, nostack),
);
}
Here we have a snippet of inline assembly which just moves the current values
of the registers into local Rust variables. Reading rip
is a bit more involved,
but that is a completely different story.
Now that we have the base pointer, we can use it to walk up the stack using this simple code:
for _ in 0..=5 {
ip = unsafe { (bp as *const usize).offset(1).read() };
bp = unsafe { (bp as *const usize).offset(0).read() };
println!("bp: {:#018x}; ip: {:#018x}", bp, ip);
}
That is basically all there is to walking up the stack, if you have a valid base pointer to work with.
I double-checked my output and it is the same as in produced by std::backtrace
.
You can run the full example in the compiler explorer
as well.
While working on this, I had a super hard time and couldn’t get this to work on Windows. In the end, I think that I hit a miscompilation bug in Rust. I filed a bug about it.
Instead of writing the current stack pointer into the base pointer register on function start, and growing the stack afterward, the code generated on windows first adjusts the stack pointer, and then tries to write the base pointer, adjusting for the offset again, though it seems that it messes up that offset sometimes.
Update
A comment on the issue I filed said that on Windows, the base pointer might as well be offset. And that offset can be looked up in the Unwind Info for that particular DLL/function. We will take a look at how to read that unwind info next time.
# What if there is no base pointer
In my previous post, the examples that I showed did not use any base pointer.
And also today I had to use an explicit -Cforce-frame-pointers=yes
compiler
flag to make it use a frame pointer. This shows clearly that you don’t really
need to maintain a base pointer when running the program.
So Rust does not maintain one by default. And I remember the oblivious
-fomit-frame-pointer
compiler options back when I didn’t know what it actually
does. So now we know. It avoids a few instructions per function call, saves a
bit of space on the stack, and frees up the rbp
register. Which I would
argue is the main reason, since general purpose registers are really scarce on
x64, it makes sense to have one more available for use.
But from a stack unwinding perspective, it complicates things, since we don’t have all the information we need to unwind right there in the registers or on the stack.
Essentially all we need is the position of the return address relative to the stack pointer (which still is a special purpose register). And we will look at where to get this information the next time.