Finding loaded libraries on Linux
Well, I am still very much procrastinating on writing the next blog post in my Relax and Unwind series about writing a stack unwinder from scratch.
However, today's topic is a prerequisite for that. We will take a look at how we can get a list of loaded libraries on Linux.
Usually, the platform provides the necessary APIs to get a list of loaded libraries directly from the dynamic loader that is responsible for loading them. On Windows, you have the Tool Help Library, and on Apple platforms you have some dyld functions available.
For better or for worse, on Linux there are no standardized userspace tools.
GNU/Linux has the
dl_iterate_phdr function for this purpose, but
that is notably not available on ancient Android systems (the
Bionic Status lists the API as available starting with API 21,
aka Android 5, released at the end of 2014).
So if you have to support ancient Android versions, which unfortunately we have
to, you will need to get the list of loaded libraries from somewhere else.
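For comparison, where dl_iterate_phdr is available, asking the loader directly is short. The following is a minimal Python sketch via ctypes, assuming a Linux libc that exports the function. The names `DlPhdrInfo` and `loaded_libraries` are made up for this example, and only the first four fields of `struct dl_phdr_info` are declared, since that is all the callback reads.

```python
import ctypes


class DlPhdrInfo(ctypes.Structure):
    # Leading fields of `struct dl_phdr_info` from <link.h>.
    # ElfW(Addr) is pointer-sized, so c_size_t is used as a stand-in.
    _fields_ = [
        ("dlpi_addr", ctypes.c_size_t),   # load base of the object
        ("dlpi_name", ctypes.c_char_p),   # path as the loader knows it
        ("dlpi_phdr", ctypes.c_void_p),   # pointer to the program headers
        ("dlpi_phnum", ctypes.c_uint16),  # number of program headers
    ]


# int (*callback)(struct dl_phdr_info *info, size_t size, void *data)
CALLBACK = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(DlPhdrInfo), ctypes.c_size_t, ctypes.c_void_p
)


def loaded_libraries():
    """Return (load_base, name) pairs reported by the dynamic loader."""
    libc = ctypes.CDLL(None)
    libc.dl_iterate_phdr.argtypes = [CALLBACK, ctypes.c_void_p]
    libc.dl_iterate_phdr.restype = ctypes.c_int

    found = []

    def on_object(info, _size, _data):
        name = info.contents.dlpi_name or b""
        found.append((info.contents.dlpi_addr, name.decode(errors="replace")))
        return 0  # a non-zero return value would stop the iteration

    libc.dl_iterate_phdr(CALLBACK(on_object), None)
    return found


if __name__ == "__main__":
    for base, name in loaded_libraries():
        print(hex(base), name or "<main program>")
```

This only works from inside the process, and only where the libc provides the function, which is exactly the limitation described above.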
It seems the state of the art is to parse the memory map info from the process's
/proc/X/maps file and try to find the mapped ELF files that way. It is what Breakpad
(in two places),
Crashpad, Android's libunwindstack and
LLDB do. I think another reason these tools do it that way is
that some of them are outside observers that simply cannot query the dynamic
loader from inside the process.
This is also the approach I implemented for sentry-native.
However, that implementation was rather conservative and did not catch all the
cases, most notably
.so files loaded directly from inside Android apk files.
So I rethought the approach to support more cases, and here I want to document it, along with a few interesting cases that I found.
The format of
/proc/X/maps is documented in the proc(5) manpage.
It includes the start/end of the virtual address space covered by the mapping,
as well as permission information, and information about the inode (file) it is
coming from, and the offset inside that file.
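To make the fields concrete, here is a small Python sketch that splits one maps line into those parts. The `Mapping` type and `parse_maps_line` function are names invented for this example, not part of any library.

```python
from typing import NamedTuple, Optional


class Mapping(NamedTuple):
    start: int    # first address covered by the mapping
    end: int      # one past the last address
    perms: str    # e.g. "r-xp"
    offset: int   # offset into the backing file
    dev: str      # device as "major:minor"
    inode: int    # 0 for anonymous mappings
    path: str     # empty for anonymous mappings


def parse_maps_line(line: str) -> Optional[Mapping]:
    """Parse a single line in the /proc/X/maps format, or return None."""
    # Split at most 5 times so that a path containing spaces stays intact.
    parts = line.strip().split(None, 5)
    if len(parts) < 5:
        return None
    addresses, perms, offset, dev, inode = parts[:5]
    start, end = (int(value, 16) for value in addresses.split("-"))
    path = parts[5] if len(parts) > 5 else ""
    return Mapping(start, end, perms, int(offset, 16), dev, int(inode), path)


if __name__ == "__main__":
    with open("/proc/self/maps") as maps:
        for raw in maps:
            print(parse_maps_line(raw))
```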
On Linux, all executables and libraries use the ELF format. There was recently a really great post on the Cloudflare Blog that explained the ELF format, and how a loader parses and processes it, in great detail.
There are cases when a library uses just one mapping, but most of the time, it is split into two or more mappings. Usually a read-only mapping that includes the ELF headers and some metadata, and an executable mapping that holds the actual program code.
On my Linux system, I saw up to 6 mappings for a single file:
    7f8cd3467000-7f8cd3475000 r--p 00000000 00:1c 7597971 /usr/lib/libcurl.so.4.7.0
    7f8cd3475000-7f8cd34da000 r-xp 0000e000 00:1c 7597971 /usr/lib/libcurl.so.4.7.0
    7f8cd34da000-7f8cd34f6000 r--p 00073000 00:1c 7597971 /usr/lib/libcurl.so.4.7.0
    7f8cd34f6000-7f8cd34f7000 ---p 0008f000 00:1c 7597971 /usr/lib/libcurl.so.4.7.0
    7f8cd34f7000-7f8cd34fa000 r--p 0008f000 00:1c 7597971 /usr/lib/libcurl.so.4.7.0
    7f8cd34fa000-7f8cd34fc000 rw-p 00092000 00:1c 7597971 /usr/lib/libcurl.so.4.7.0
The interesting case here is that the 4th mapping is not readable, and basically creates a gap in the address space.
Another interesting case I found on Android:
    737b5570d000-737b5570e000 r--p 00000000 07:70 34 /apex/com.android.runtime/lib64/bionic/libdl.so
    737b5570e000-737b5570f000 r-xp 00000000 07:70 34 /apex/com.android.runtime/lib64/bionic/libdl.so
    737b5570f000-737b55710000 r--p 00000000 07:70 34 /apex/com.android.runtime/lib64/bionic/libdl.so
Here, the same file at the same offset is mapped onto different address ranges.
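Both quirks are easy to spot mechanically. The following Python sketch (the function name `mapping_quirks` is made up for this example) scans maps-style text and reports, per file, the ranges that are not readable and the file offsets that are mapped more than once.

```python
from collections import Counter, defaultdict


def mapping_quirks(maps_text):
    """Report unreadable ranges and repeated file offsets per mapped file."""
    unreadable = defaultdict(list)
    offsets = defaultdict(Counter)
    for line in maps_text.strip().splitlines():
        parts = line.split(None, 5)
        if len(parts) < 6:
            continue  # anonymous mapping without a backing file
        start, end = (int(value, 16) for value in parts[0].split("-"))
        path = parts[5].strip()
        if not parts[1].startswith("r"):
            unreadable[path].append((start, end))
        offsets[path][int(parts[2], 16)] += 1
    return {
        path: {
            "unreadable": unreadable[path],
            "duplicate_offsets": sorted(o for o, n in counts.items() if n > 1),
        }
        for path, counts in offsets.items()
    }


LIBDL = """
737b5570d000-737b5570e000 r--p 00000000 07:70 34 /apex/com.android.runtime/lib64/bionic/libdl.so
737b5570e000-737b5570f000 r-xp 00000000 07:70 34 /apex/com.android.runtime/lib64/bionic/libdl.so
737b5570f000-737b55710000 r--p 00000000 07:70 34 /apex/com.android.runtime/lib64/bionic/libdl.so
"""

if __name__ == "__main__":
    print(mapping_quirks(LIBDL))
```

For the libdl.so listing this flags offset 0 as mapped repeatedly. Fed the libcurl listing, it would flag the single unreadable page and the offset 0x8f000 that the unreadable mapping shares with the one after it.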
The way that the Android loader loads libraries directly from apks is also interesting. Compare the following two mappings, which load the exact same libraries, once extracted to disk, once directly from the apk:
    77a85dbda000-77a85dbdd000 r-xp 00000000 fd:05 40992 /data/app/x/y/lib/x86_64/libsentry-android.so
    77a85dbdd000-77a85dbde000 ---p 00000000 00:00 0
    77a85dbde000-77a85dbdf000 r--p 00003000 fd:05 40992 /data/app/x/y/lib/x86_64/libsentry-android.so
    77a85dc15000-77a85dd6c000 r-xp 00000000 fd:05 40991 /data/app/x/y/lib/x86_64/libsentry.so
    77a85dd6c000-77a85dd6d000 ---p 00000000 00:00 0
    77a85dd6d000-77a85dd79000 r--p 00157000 fd:05 40991 /data/app/x/y/lib/x86_64/libsentry.so
    77a85dd79000-77a85dd7a000 rw-p 00163000 fd:05 40991 /data/app/x/y/lib/x86_64/libsentry.so

    77a85dbf0000-77a85dbf3000 r-xp 00001000 fd:05 40977 /data/app/x/y/base.apk
    77a85dbf3000-77a85dbf4000 ---p 00000000 00:00 0
    77a85dbf4000-77a85dbf5000 r--p 00004000 fd:05 40977 /data/app/x/y/base.apk
    77a85dc15000-77a85dd6c000 r-xp 00006000 fd:05 40977 /data/app/x/y/base.apk
    77a85dd6c000-77a85dd6d000 ---p 00000000 00:00 0
    77a85dd6d000-77a85dd79000 r--p 0015d000 fd:05 40977 /data/app/x/y/base.apk
    77a85dd79000-77a85dd7a000 rw-p 00169000 fd:05 40977 /data/app/x/y/base.apk
The mappings are basically the same, except that in the case of the apk
the file offsets are different. Also, the Android loader inserts a non-readable
gap in between.
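The offset difference is constant per library, which can be checked with a bit of arithmetic on the listings above. In this Python sketch the offsets are copied from those listings; the helper name `apk_base_offset` is made up, and reading the constant as "where the library starts inside the apk" is my interpretation rather than something the maps file states.

```python
# File offsets per library, copied from the two listings.
ON_DISK = {
    "libsentry-android.so": [0x0, 0x3000],
    "libsentry.so": [0x0, 0x157000, 0x163000],
}
IN_APK = {
    "libsentry-android.so": [0x1000, 0x4000],
    "libsentry.so": [0x6000, 0x15D000, 0x169000],
}


def apk_base_offset(disk_offsets, apk_offsets):
    """Return the constant shift between both offset lists, or None."""
    deltas = {apk - disk for disk, apk in zip(disk_offsets, apk_offsets)}
    return deltas.pop() if len(deltas) == 1 else None


if __name__ == "__main__":
    for name in ON_DISK:
        shift = apk_base_offset(ON_DISK[name], IN_APK[name])
        print(name, hex(shift) if shift is not None else "no constant shift")
```

Every mapping of libsentry-android.so is shifted by 0x1000 and every mapping of libsentry.so by 0x6000, so subtracting that shift from an apk offset gives back the offset within the library itself.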
# So how do we get the library list from there?
So far, the sentry-native modulefinder was a bit too conservative. Because of concerns about reading arbitrary memory, we mmap-ed the file into memory and tried to extract the ELF headers from there, but that approach did not work with libraries loaded directly from apk files. Plus, there were some issues related to non-contiguous mappings and double mappings, as we have seen above.
My new approach is to keep track of readable mappings that I have seen so far, keeping track of their file offsets and gaps in between them. For each readable mapping, I am looking for the magic ELF signature. If I find one, I process the previously saved mappings, also taking care of possible duplicates.
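As a rough illustration of that idea, here is a simplified Python sketch. It is not the sentry-native code (which is C), and it flushes a module as soon as the next ELF header shows up, which is one of several possible ways to arrange the bookkeeping. The function name `find_modules`, the tuple layout and the `read_memory` callback are assumptions for this example. The callback stands in for whatever mechanism reads the first bytes of a mapping.

```python
ELF_MAGIC = b"\x7fELF"


def find_modules(mappings, read_memory):
    """Group (start, end, perms, offset, path) mappings into modules.

    `read_memory(address, length)` must return the bytes at `address`;
    it is only called for readable, file-backed mappings.
    """
    modules, current = [], None

    for start, end, perms, offset, path in mappings:
        readable = perms.startswith("r")
        if readable and path and read_memory(start, 4) == ELF_MAGIC:
            if current and (current["path"], current["offset"]) == (path, offset):
                # Same file at the same offset again: a duplicate mapping of
                # the header, so extend the module instead of starting anew.
                current["end"] = max(current["end"], end)
                continue
            if current:
                modules.append(current)
            current = {"path": path, "start": start, "end": end, "offset": offset}
        elif current and path == current["path"]:
            current["end"] = end  # a later segment of the same file
        elif current and not path and not readable:
            continue  # unreadable gap; it only counts if the file resumes
        elif current:
            modules.append(current)  # unrelated mapping ends the module
            current = None

    if current:
        modules.append(current)
    return modules
```

Run over the base.apk listing with a `read_memory` that finds the signature at 77a85dbf0000 and 77a85dc15000, this yields two modules with file offsets 0x1000 and 0x6000, each spanning its unreadable gap.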
A possible issue is trying to read arbitrary memory. I think I’m pretty safe, as
I only consider readable mappings. One improvement would be to use
process_vm_readv for this, although I have also seen problems with
using that on Android.
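The appeal of process_vm_readv is that a bad address produces an error return instead of a crash. Here is a minimal Python sketch of such a guarded read via ctypes, assuming a Linux libc that exports the wrapper. `IoVec` and `safe_read` are names invented for this example, and the function returns None whenever the call is unavailable, blocked, or comes back short.

```python
import ctypes
import os


class IoVec(ctypes.Structure):
    # Mirrors `struct iovec` from <sys/uio.h>.
    _fields_ = [("iov_base", ctypes.c_void_p), ("iov_len", ctypes.c_size_t)]


def safe_read(address, length, pid=None):
    """Return `length` bytes at `address`, or None if they cannot be read."""
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        readv = libc.process_vm_readv
    except (OSError, AttributeError, TypeError):
        return None  # not Linux, or the libc does not export the wrapper

    # ssize_t process_vm_readv(pid_t, const struct iovec *, unsigned long,
    #                          const struct iovec *, unsigned long, unsigned long)
    readv.restype = ctypes.c_ssize_t
    readv.argtypes = [
        ctypes.c_int,
        ctypes.POINTER(IoVec), ctypes.c_ulong,
        ctypes.POINTER(IoVec), ctypes.c_ulong,
        ctypes.c_ulong,
    ]

    buffer = ctypes.create_string_buffer(length)
    local = IoVec(ctypes.addressof(buffer), length)
    remote = IoVec(address, length)
    target = os.getpid() if pid is None else pid
    copied = readv(target, ctypes.byref(local), 1, ctypes.byref(remote), 1, 0)
    if copied != length:
        return None  # error (see ctypes.get_errno()) or partial read
    return buffer.raw
```

A caller would treat None as "skip this mapping", which also covers environments where the syscall is filtered out.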
Another unanswered question is how to correctly deal with mappings that have gaps in them, or that appear multiple times. Information embedded in the ELF file might instruct the loader to load the executable code at a specific offset to the ELF header in RAM, which might be different to the offset on disk. This very much depends on how we use this information to post-process crash reports.
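The information in question lives in the ELF program headers: each PT_LOAD entry has a file offset (p_offset) and a virtual address (p_vaddr), and the two need not match. As a minimal sketch, the following Python code lists both for a little-endian ELF64 image. The function name `load_segments` is made up, and 32-bit or big-endian files are deliberately rejected to keep it short.

```python
import struct

# Elf64_Ehdr and Elf64_Phdr layouts, little-endian, without padding.
EHDR64 = struct.Struct("<16sHHIQQQIHHHHHH")
PHDR64 = struct.Struct("<IIQQQQQQ")
PT_LOAD = 1


def load_segments(data):
    """Return (p_offset, p_vaddr, p_memsz) for every PT_LOAD segment."""
    fields = EHDR64.unpack_from(data, 0)
    ident, phoff, phentsize, phnum = fields[0], fields[5], fields[9], fields[10]
    # ident[4] == 2 means ELFCLASS64, ident[5] == 1 means little-endian.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 image")

    segments = []
    for index in range(phnum):
        (p_type, _flags, p_offset, p_vaddr,
         _paddr, _filesz, p_memsz, _align) = PHDR64.unpack_from(
            data, phoff + index * phentsize)
        if p_type == PT_LOAD:
            segments.append((p_offset, p_vaddr, p_memsz))
    return segments


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as elf:
        for p_offset, p_vaddr, p_memsz in load_segments(elf.read()):
            print(hex(p_offset), hex(p_vaddr), hex(p_memsz))
```

Wherever p_vaddr and p_offset differ for a segment, an address inside that segment cannot be turned into a file offset by simply subtracting the start of the mapping.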
All in all, this is a decidedly non-trivial problem, and I am far from the only
one struggling with it. It seems that the
libunwindstack that I mentioned above,
and which we vendor as our unwinder on Android, has issues itself, as it is unable
to correctly create a stack trace that involves libraries loaded from an apk.
We have also seen some Breakpad tools getting this wrong and creating minidumps
with duplicated/invalid mappings that fail post-processing. It might be quite
some work to investigate those failures and patch the relevant external dependencies.
And through all this, I wish I could implement all of this in a sane language like Rust, and share that code across different steps of the pipeline. Oh well…