Description of problem: Since a recent upgrade to valgrind, it is very slow to start up on certain binaries. For example: $ valgrind --log-file=/dev/null /bin/gcc <---- very long pause here gcc: fatal error: no input files compilation terminated. It takes 7.6 seconds to load the GCC binary on my machine. Other binaries are not affected so much (eg. /bin/ls is quite fast). I have another program which I'm testing which also shows the same slow start-up behaviour, and I modified it to insert a print statement at the beginning of main(). The pause appears to happen before main() is entered. Version-Release number of selected component (if applicable): valgrind-3.14.0-7.fc29.x86_64 How reproducible: Unknown - only happens on some machines. Steps to Reproduce: 1. See description above.
Created attachment 1517584 [details] perf report perf report from: # perf record -g -F 99 -- valgrind --log-file=/dev/null /bin/gcc
Ah ha! I've worked out what changed which caused the slowdown. It was the installation of debuginfo for the binary. On another machine of mine: $ time valgrind --log-file=/dev/null /bin/gcc gcc: fatal error: no input files compilation terminated. real 0m0.863s user 0m0.674s sys 0m0.035s Then after installing gcc-debuginfo: $ time valgrind --log-file=/dev/null /bin/gcc gcc: fatal error: no input files compilation terminated. real 0m18.058s user 0m17.949s sys 0m0.054s
Yeah valgrind is slow reading debuginfo and somewhat eager. Although this seems somewhat excessive. Thanks for the perf report. It looks like a lot of time goes into vgPlain_debugLog_vprintf. I wonder if valgrind is trying to tell us something about the debuginfo it is seeing. Could you run without --log-file=/dev/null or put the output in a file and attach it if there is something odd in it? Also could you try with --read-inline-info=no to see if it might be related to getting better backtraces?
The complete output without --log-file is: ==13709== Memcheck, a memory error detector ==13709== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. ==13709== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info ==13709== Command: /bin/gcc ==13709== gcc: fatal error: no input files compilation terminated. ==13709== ==13709== HEAP SUMMARY: ==13709== in use at exit: 169,913 bytes in 43 blocks ==13709== total heap usage: 187 allocs, 144 frees, 197,242 bytes allocated ==13709== ==13709== LEAK SUMMARY: ==13709== definitely lost: 4,253 bytes in 4 blocks ==13709== indirectly lost: 0 bytes in 0 blocks ==13709== possibly lost: 0 bytes in 0 blocks ==13709== still reachable: 165,660 bytes in 39 blocks ==13709== suppressed: 0 bytes in 0 blocks ==13709== Rerun with --leak-check=full to see details of leaked memory ==13709== ==13709== For counts of detected and suppressed errors, rerun with: -v ==13709== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) There don't appear to be any errors about bad debuginfo. The --read-inline-info option DOES make a big difference: $ time valgrind --log-file=/dev/null --read-inline-info=yes /bin/gcc gcc: fatal error: no input files compilation terminated. real 0m7.904s user 0m7.840s sys 0m0.048s $ time valgrind --log-file=/dev/null --read-inline-info=no /bin/gcc gcc: fatal error: no input files compilation terminated. real 0m0.511s user 0m0.475s sys 0m0.035s
This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle. Changing version to '31'.
This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle. Changing version to 31.
We have to discuss upstream what to do about this. The defaults are now so that more debuginfo is read, which does improve warning/error reporting. But the slowdown is pretty big.
This message is a reminder that Fedora 31 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '31'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 31 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged change the 'version' to a later Fedora version prior this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
This has become more urgent now that valgrind (3.17.0) supports debuginfod and that has been enabled by default in rawhide: https://fedoraproject.org/wiki/Changes/DebuginfodByDefault
The debuginfod by default behaviour certainly feels like a good step for something like GDB because that's not so performance critical in the paths where it acquires the data. For valgrind though, I think maintaining high performance is more important than use of debuginfod, because valgrind already has a significant penalty compared to native execution. So if we can't get the valgrind performance improved in the near term (ie before F35 release GA), can we consider disabling use of debuginfod by default, specifically for valgrind only, as a temporary measure ?
Yes, if we cannot make the valgrind debuginfo reader faster before F35 GA then we should disable debuginfod downloading by default for valgrind. But I believe I can fix it (or at least make it a lot faster). But now I like to keep it enabled in rawhide because it already found some issues.
FEDORA-2021-04b130ee33 has been submitted as an update to Fedora 35. https://bodhi.fedoraproject.org/updates/FEDORA-2021-04b130ee33
Please test the proposed update valgrind-3.17.0-13.fc35. For the original testcase, valgrind -q /usr/bin/gcc, on my local system, with gcc-debuginfo installed, runtime goes from 12 seconds to 0.6 seconds. Without any debuginfo installed runtime under valgrind is 0.4 seconds. So debuginfo reading for the default case should now have an overhead of only ~0.2 seconds.
I'm using tests/test-nbd-tls.sh from nbdkit. https://gitlab.com/nbdkit/nbdkit/-/blob/master/tests/test-nbd-tls.sh With DEBUGINFOD_URLS unset this test takes 9 seconds, so I'll take that as my baseline. Using nbdkit + this commit reverted: https://gitlab.com/nbdkit/nbdkit/-/commit/8c94611ca3854c4e With valgrind-3.17.0-11.fc35.x86_64 the following test takes 30 seconds: $ make && time make check-valgrind TESTS=test-nbd-tls.sh With valgrind-3.17.0-13.fc36.x86_64 it takes about 15 seconds (consistent over several runs). It seems as if debuginfod is still taking about 6 seconds, although it's an improvement on before. Oh wait! I don't actually have elfutils-debuginfod installed! Let's try that again with debuginfod installed and running ... It still takes 15 seconds. I would say it is an improvement, but there may still be something I'm missing here. I have a few questions: - Does debuginfod globally install the debuginfo RPMs that it downloads? (It seems like "no", because the above test uses gnutls but I do not see any gnutls-debuginfo* packages installed.) - Does debuginfod cache the RPMs, so at least they don't need to be downloaded twice?
(In reply to Richard W.M. Jones from comment #14) > I'm using tests/test-nbd-tls.sh from nbdkit. > https://gitlab.com/nbdkit/nbdkit/-/blob/master/tests/test-nbd-tls.sh > With DEBUGINFOD_URLS unset this test takes 9 seconds, so I'll take > that as my baseline. > > Using nbdkit + this commit reverted: > https://gitlab.com/nbdkit/nbdkit/-/commit/8c94611ca3854c4e > > With valgrind-3.17.0-11.fc35.x86_64 the following test takes 30 seconds: > > $ make && time make check-valgrind TESTS=test-nbd-tls.sh > > With valgrind-3.17.0-13.fc36.x86_64 it takes about 15 seconds > (consistent over several runs). > > It seems as if debuginfod is still taking about 6 seconds, although > it's an improvement on before. > > Oh wait! I don't actually have elfutils-debuginfod installed! > Let's try that again with debuginfod installed and running ... > > It still takes 15 seconds. > > I would say it is an improvement, but there may still be something > I'm missing here. The 3.17.0-133 patches improve debuginfo reading however it is installed (either build into the binary, installed as separate debuginfo packages or automatically downloaded through debuginfod). From the above I don't really know which scenario you are testing. You can see if the patches help by looking at the different between running valgrind --read-inline-info=yes (the default) and valgrind --read-inline-info=no. There is also --read-var-info=yes/no (but that defaults to no, so should not impact the normal case unless explicitly enabled). > I have a few questions: > > - Does debuginfod globally install the debuginfo RPMs that it downloads? > (It seems like "no", because the above test uses gnutls but I do not > see any gnutls-debuginfo* packages installed.) debuginfod-client (which valgrind uses when installed and DEBUGINFOD_URLS is set) doesn't globally install debuginfo RPMS. It downloads individual .debug/dwz files (and source files, but valgrind never requests those) under $XDG_CACHE_HOME/debuginfod_client/ (or $HOME/.cache/debuginfod_client/) > - Does debuginfod cache the RPMs, so at least they don't need to be > downloaded twice? See above. It does cache downloaded files, but those aren't RPMs.
I see a lot of <hash>/debuginfo files under $HOME/.cache/debuginfod_client/ some of which binary match "gnutls".
(In reply to Richard W.M. Jones from comment #16) > I see a lot of <hash>/debuginfo files under $HOME/.cache/debuginfod_client/ > some of which binary match "gnutls". That is good, those will be used when you have debuginfod-client installed and DEBUGINFOD_URLS set. But the question really is not about debuginfod but how well the updates to the valgrind DWARF reader is working. The best way to test this is to see (with debuginfo installed, either through rpms or through debuginfod) what the timing difference is between valgrind --read-inline-info=no and valgrind --read-inline-info=yes (the default).
FEDORA-2021-04b130ee33 has been pushed to the Fedora 35 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-04b130ee33` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-04b130ee33 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
I'm still unclear how this is supposed to work (why doesn't debuginfod just install the debuginfo RPMs?), or how --read-inline-info could affect debuginfo, however I did some simple testing and there's no obvious slowdown so I guess it's fixed.
(In reply to Richard W.M. Jones from comment #19) > I'm still unclear how this is supposed to work (why doesn't debuginfod > just install the debuginfo RPMs?), or how --read-inline-info could affect > debuginfo, however I did some simple testing and there's no obvious > slowdown so I guess it's fixed. Thanks for testing. Lets see if I can explain how "this" works. The fix itself has nothing to do with how you got the debuginfo associated with the binary (and libraries) you want to run under valgrind. Lets just assume the debuginfo is there, either build into the binary itself, installed through (system wide) rpms, or through debuginfod (because you installed elfutils-debuginfod-client and have DEBUGINFOD_URLS set). The --read-inline-info=yes setting is the default for valgrind (in most cases). This is the description from the manual: --read-inline-info=<yes|no> [default: see below] When enabled, Valgrind will read information about inlined function calls from DWARF3 debug info. This slows Valgrind startup and makes it use more memory (typically for each inlined piece of code, 6 words and space for the function name), but it results in more descriptive stacktraces. Currently, this functionality is enabled by default only for Linux, Android and Solaris targets and only for the tools Memcheck, Massif, Helgrind and DRD. Here is an example of some stacktraces with --read-inline-info=no: ==15380== Conditional jump or move depends on uninitialised value(s) ==15380== at 0x80484EA: main (inlinfo.c:6) ==15380== ==15380== Conditional jump or move depends on uninitialised value(s) ==15380== at 0x8048550: fun_noninline (inlinfo.c:6) ==15380== by 0x804850E: main (inlinfo.c:34) ==15380== ==15380== Conditional jump or move depends on uninitialised value(s) ==15380== at 0x8048520: main (inlinfo.c:6) And here are the same errors with --read-inline-info=yes: ==15377== Conditional jump or move depends on uninitialised value(s) ==15377== at 0x80484EA: fun_d (inlinfo.c:6) ==15377== by 0x80484EA: fun_c (inlinfo.c:14) ==15377== by 0x80484EA: fun_b (inlinfo.c:20) ==15377== by 0x80484EA: fun_a (inlinfo.c:26) ==15377== by 0x80484EA: main (inlinfo.c:33) ==15377== ==15377== Conditional jump or move depends on uninitialised value(s) ==15377== at 0x8048550: fun_d (inlinfo.c:6) ==15377== by 0x8048550: fun_noninline (inlinfo.c:41) ==15377== by 0x804850E: main (inlinfo.c:34) ==15377== ==15377== Conditional jump or move depends on uninitialised value(s) ==15377== at 0x8048520: fun_d (inlinfo.c:6) ==15377== by 0x8048520: main (inlinfo.c:35) So if the debuginfo is available then valgrind is able to provide much more descriptive backtraces for issues. But this does come with a startup penalty. This fix makes this penalty much lower (from several seconds to less than a second). If the patches work well it should be hard to even measure the (extra) slowdown from having debuginfo available. Before the patch there could be a significant different between running with the default --read-inline-info=yes and --read-inline-info=no. This doesn't really have much to do with debuginfod, except that because valgrind now uses and recommends elfutils-debuginfod-client and on Fedora 35 DEBUGINFOD_URLS is setup by default it is more likely the debuginfo (or at least the system libraries) for your program is available. For how debuginfod works and why it provides individual debug files and not whole RPMs see https://fedoraproject.org/wiki/Debuginfod and https://sourceware.org/elfutils/Debuginfod.html
Thanks - I understand better now. I was confused because I thought the fix was something to do with the time taken to download the debuginfo data.
FEDORA-2021-eb2442ba01 has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2021-eb2442ba01
FEDORA-2021-04b130ee33 has been pushed to the Fedora 35 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2021-eb2442ba01 has been pushed to the Fedora 34 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-eb2442ba01` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-eb2442ba01 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2021-eb2442ba01 has been pushed to the Fedora 34 stable repository. If problem still persists, please make note of it in this bug report.