[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
profiling libunwind
From: |
Mark Nelson |
Subject: |
profiling libunwind |
Date: |
Wed, 13 Oct 2021 15:33:10 -0500 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0 |
Hi list,
Recently I decided to re-implement my old gdb based poorman's (ie
wallclock) profiler in c++ using libunwind directly since we've started
having problems profiling Ceph using my hacky gdb version. I've got a
small prototype here:
https://github.com/markhpc/uwpmp
As soon as I got it working I took a stab at profiling a multithreaded
application (ceph-osd) with about 80 threads. It more or less worked
fine and was somewhat faster than using gdb to grab callstacks, but I
was hoping for more. I decided to profile the profiler while it
profiled. Initially (see profile 1) I saw a fairly large amount of time
spent lzma_decode for minidebuginfo along with time spent in
find_proc_info due to lack of caching. I disabled minidebuginfo and set
a global cache with size 1024 and that did wonders, with ~95% of
wallclock time now spent in get_proc_name (see profile 2). I'm now
trying to take this a bit further but wanted feedback from the list. In
the most recent trace, I see time in get_proc_name split mostly between
get_proc_name_in_image and get_elf_image:
||+ 94.50% _Uelf64_get_proc_name
|| + 51.60% _Uelf64_get_proc_name_in_image
|| |+ 2.60% __strncpy_avx2
|| + 33.60% _Ux86_64_get_elf_image
|| |+ 26.50% __read
|| |+ 2.30% __munmap
|| |+ 1.10% __open64
|| |+ 0.40% close
|| |+ 0.20% __fxstat64
|| |+ 0.20% mmap64
|| + 7.30% __munmap
|| + 2.00% _Uelf64_load_debuglink
|| + 1.00% _Uelf64_load_debuglink
|| |+ 1.00% __open64
|| + 1.00% _Uelf64_find_section
Looking at the code, it appears that at least for
get_proc_name_in_image, someone anticipated that this could be slow:
https://github.com/libunwind/libunwind/blob/master/src/elfxx.c#L271-L274
Most of the time in get_elf_image appears to be in __read. Seems like
another opportunity for gains? Does anyone have any advice before I dig
much deeper into this? Especially anyone else that has started going
down this path?
Thanks,
Mark
uwpmp_self_profile_2.txt
Description: Text document
uwpmp_self_profile_1.txt
Description: Text document
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- profiling libunwind,
Mark Nelson <=