Force Unwind Tables

I was recently investigating a customer issue and got stuck: I could not find any unwind information for one specific piece of code, while the rest of the code had it. My conclusion was that this could potentially be a problem in the customer’s build chain. I found a similar problem when looking at a truncated stack trace on Android.

To better understand how this can happen, I tried to reproduce it myself. So let’s build a program that has debug and unwind information for some parts of the program, but not for others.

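// opt.c
void print_backtrace(void); // definition not shown here

void fn_without_debuginfo(void)
{
    print_backtrace();
}

// main.c
#include <stdio.h>

void print_backtrace(void);      // definition not shown here
void fn_without_debuginfo(void); // defined in opt.c

void indirect_call(void)
{
    printf("=== indirect call ===\n");
    fn_without_debuginfo();
}

void direct_call(void)
{
    printf("=== direct call ===\n");
    print_backtrace();
}

int main(void)
{
    direct_call();
    indirect_call();
    return 0;
}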

I split my program into two compile units, which I compile with different flags before I link them into the final executable:

	gcc -c opt.c -Os -fno-asynchronous-unwind-tables -fno-optimize-sibling-calls
	gcc -c main.c -g
	gcc -o foo main.o opt.o -ldl -rdynamic

Here, -fno-optimize-sibling-calls stops gcc from being smart and tail-call-optimizing my whole fn_without_debuginfo away. I also needed -rdynamic when linking the executable to be able to symbolicate the stack trace.

The important piece here is -fno-asynchronous-unwind-tables, which instructs gcc not to create the .eh_frame section that contains the unwind tables. In addition, this compile unit is built without debug information.

Running my executable yields the expected (broken) results:

=== direct call ===
0 - 0x0x55f80c10e1b7 (+0x0x11b7) - print_backtrace
1 - 0x0x55f80c10e2e1 (+0x0x12e1) - direct_call
2 - 0x0x55f80c10e2f2 (+0x0x12f2) - main
3 - 0x0x7f94c555db25 (+0x0x27b25) - __libc_start_main
4 - 0x0x55f80c10e0be (+0x0x10be) - _start

=== indirect call ===
0 - 0x0x55f80c10e1b7 (+0x0x11b7) - print_backtrace
1 - 0x0x55f80c10e307 (+0x0x1307) - fn_without_debuginfo

The stack trace is truncated after my fn_without_debuginfo, exactly as intended. It is, however, not what you would expect to see when using a tool such as Sentry.

# Lose some Weight

It turns out that all this information can lead to a bit of binary bloat. I have seen reports that unwind information can take up as much as 10% of the resulting binary size. So in cases where binary size matters, which it especially does for mobile and embedded, there are a few tutorials online that advocate removing the whole .eh_frame section completely.

# Small and Big Unwind Information

Coming back to my example, I compiled part of my program with the -g switch, which turned on detailed debug information.

This detailed debug info contains all the details to allow debuggers to show which local variables are defined, and where on the stack or in which registers they can be found, among other information.

I will call this the big unwind information. And yes, it is huge, sometimes 2x or even 10x the binary size.

It is, however, not needed at runtime, and it is best practice to split it off from the actual binary. It also contains sensitive details about the codebase ;-)

The small .eh_frame remains in the binary, as you might need it at runtime, for instance to create a stack trace as in our example. In some cases you can also safely remove it; more on that later.


Another interesting case is statically linked libraries. How do you know which flags they were compiled with? Do they have both kinds of unwind information or not? How would I know when looking at the libc.a that comes with the Arch Linux musl package?


To summarize, there are two different sets of unwind information, big and small: one is shipped together with the executable, the other is not. And you can end up in situations where parts of your program have either one, both or neither.

This is also the reason why, when working with Sentry, it is important to upload both the executable and the accompanying debug file.

# Rust

I actually hit a similar problem recently while playing around with -C panic=abort in Rust. I was surprised that the panic! backtrace was truncated and filed an issue about it. It turns out that Rust by default does not create unwind info when compiling with panic=abort. This option is frequently recommended to avoid binary bloat. And sure enough, if you never want to catch a panic!, you don’t need it; however, you will also lose the ability to get a meaningful stack trace.

You can get that ability back by using -C force-unwind-tables.

In this case I must say that I disagree with Rust’s behavior, as catching panics and creating stack traces are two different concerns, and panic=abort is very frequently used. So remember to use -C force-unwind-tables if you care about stack traces but not about catching panics.