Bug 1974357
Summary: | glibc pthreads updates break helgrind | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 9 | Reporter: | Mark Wielaard <mjw> |
Component: | valgrind | Assignee: | Mark Wielaard <mjw> |
valgrind sub component: | system-version | QA Contact: | Jesus Checa <jchecahi> |
Status: | CLOSED CURRENTRELEASE | Docs Contact: | |
Severity: | unspecified | ||
Priority: | unspecified | CC: | ahajkova, bugproxy, fche, fweimer, jakub, ohudlick, tstaudt, tuliom |
Version: | 9.0 | Keywords: | FutureFeature, Triaged |
Target Milestone: | beta | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | valgrind-3.17.0-8.el9 | Doc Type: | No Doc Update |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2021-12-07 21:33:05 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1958224 | ||
Bug Blocks: |
Description
Mark Wielaard
2021-06-21 13:10:46 UTC
Please note that glibc 2.34 snapshots are not yet in the buildroot/composes.

(In reply to Mark Wielaard from comment #0)
> Another issue seen is:
>
> WARNING: could not find symbol for var stack_cache_actsize in libpthread.so.0
>
> Which is in ld.so now, GL (dl_stack_cache_actsize).
> Part of _rtld_global.

If we need to disable the stack cache when running under valgrind, we can put that logic into glibc, instead of having valgrind patch glibc internals. (I understand there is a way for us to detect whether we are running under valgrind.)

What do you think?

(In reply to Florian Weimer from comment #2)
> (In reply to Mark Wielaard from comment #0)
> > Another issue seen is:
> >
> > WARNING: could not find symbol for var stack_cache_actsize in libpthread.so.0
> >
> > Which is in ld.so now, GL (dl_stack_cache_actsize).
> > Part of _rtld_global.
>
> If we need to disable the stack cache when running under valgrind, we can
> put that logic into glibc, instead of having valgrind patch glibc internals.
> (I understand there is a way for us to detect whether we are running under
> valgrind.)
>
> What do you think?

A supported way to do this would be great. The current documentation in valgrind reads:

/* glibc nptl pthread systems only, when no-nptl-pthread-stackcache
   was given in --sim-hints.
   Used for a (kludgy) way to disable the cache of stacks as implemented in nptl
   glibc.
   Based on internal knowledge of the pthread glibc nptl/allocatestack.c code:
   a huge value in stack_cache_actsize (bigger than the constant
   stack_cache_maxsize) makes glibc believes the cache is full
   and so stacks are always released when a pthread terminates.
   Several ugliness in this kludge:
    * hardcodes private glibc var name "stack_cache_maxsize"
    * based on knowledge of the code of the functions
      queue_stack and __free_stacks
    * static symbol for "stack_cache_maxsize" must be in
      the debug info.
   It would be much cleaner to have a documented and supported
   way to disable the pthread stack cache.
*/

But note that this is only done when the no-nptl-pthread-stackcache SimHint is given, which isn't the default (but is enabled for some tests).

To detect whether running under valgrind you can use the RUNNING_ON_VALGRIND macro defined in valgrind.h.

------- Comment From arnez.com 2021-06-22 12:56 EDT-------
FWIW, I just tested Valgrind on RHEL9 on s390x with a patched glibc, using Florian's patch for glibc-2.33-18.fc34.src.rpm from this Bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1965374

Stefan Liebler thankfully adjusted the patches to RHEL9. Using this changed glibc, the failures from helgrind/tests disappear, except for pth_destroy_cond, which still fails.

------- Comment From STLI.com 2021-06-23 08:38 EDT-------
@Florian Weimer:
I've just got a report that starting with glibc-2.33-18.fc34, perf fails to add a probe as it can't find the symbol, e.g. for inet_pton:
perf probe -n -vvvvv -x /usr/lib64/libc-2.33.so -a inet_pton

But it succeeds for global symbols in ".symtab" of /usr/lib64/libc-2.33.so, e.g.:
perf probe -n -vvvvv -x /usr/lib64/libc-2.33.so -a __memcpy_chk

Note:
/usr/lib64/libc-2.33.so has a ".symtab" symbol table with the symbols discussed here:
https://bugzilla.redhat.com/show_bug.cgi?id=1965374#c11
which match the "strip -K xyz" invocation in wrap-find-debuginfo.sh in the glibc source package.

In the older 2.33-8.fc34 package, by comparison, /usr/lib64/libc-2.33.so does not have a ".symtab" symbol table, and I assume perf then searches for the symbol in the libc.so from the glibc-debuginfo package. There it finds inet_pton.

Can you please have a look?

(In reply to IBM Bug Proxy from comment #7)
> ------- Comment From STLI.com 2021-06-23 08:38 EDT-------
> @Florian Weimer:
> I've just got a report that starting with glibc-2.33-18.fc34, perf fails to
> add a probe as it can't find the symbol, e.g. for inet_pton:
> perf probe -n -vvvvv -x /usr/lib64/libc-2.33.so -a inet_pton
>
> But it succeeds for global symbols in ".symtab" of /usr/lib64/libc-2.33.so
> like e.g.
> perf probe -n -vvvvv -x /usr/lib64/libc-2.33.so -a __memcpy_chk

This looks like a perf deficiency. According to the manual page, it supports dynamic symbols. However, I think the general expectation is that perf probes require the installation of debuginfo packages.

It may be worthwhile to open a separate bug report for this.

------- Comment From TMRICHT.com 2021-06-23 10:10 EDT-------
@Florian Weimer:
It is not related to missing debug info of glibc; it is installed, as you can see:

[root@f34 ~]# perf probe -f -x /usr/lib64/libc-2.33.so -a inet_pton
Probe point 'inet_pton' not found.
Error: Failed to add events.
[root@f34 ~]# rpm -qa | fgrep glibc
glibc-all-langpacks-2.33-18.fc34.x86_64
glibc-common-2.33-18.fc34.x86_64
glibc-langpack-en-2.33-18.fc34.x86_64
glibc-2.33-18.fc34.x86_64
glibc-doc-2.33-18.fc34.noarch
glibc-headers-x86-2.33-18.fc34.noarch
glibc-devel-2.33-18.fc34.x86_64
glibc-debugsource-2.33-18.fc34.x86_64
glibc-debuginfo-2.33-18.fc34.x86_64
[root@f34 ~]#

The symbol inet_pton is now in the .dynsym section of glibc:

[root@f34 ~]# readelf -sW /usr/lib64/libc-2.33.so | egrep '(dynsym|symtab|inet_pton)'
Symbol table '.dynsym' contains 2419 entries:
   628: 000000000011ea00   108 FUNC    WEAK   DEFAULT   15 inet_pton@@GLIBC_2.2.5
  2251: 000000000011e9b0    76 FUNC    GLOBAL DEFAULT   15 __inet_pton_length@@GLIBC_PRIVATE
Symbol table '.symtab' contains 104 entries:
[root@f34 ~]#

Now perf does not find it.
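
The check done by eye on the readelf output above can also be scripted. The following is a minimal sketch, not part of perf or readelf: the helper name `tables_defining` and the embedded sample text are assumptions for illustration, and the parser only handles the eight-column `readelf -sW` row layout shown in this report.

```python
import re

# Matches headers such as: Symbol table '.dynsym' contains 2419 entries:
TABLE_RE = re.compile(r"^Symbol table '([^']+)' contains \d+ entr")


def tables_defining(readelf_text, symbol):
    """Return the symbol tables (in order seen) that list `symbol`.

    Hypothetical helper: expects text in the format printed by `readelf -sW`.
    """
    found = []
    current = None
    for line in readelf_text.splitlines():
        header = TABLE_RE.match(line.strip())
        if header:
            current = header.group(1)
            continue
        fields = line.split()
        # A symbol row has 8 columns: Num: Value Size Type Bind Vis Ndx Name
        if current is None or len(fields) < 8:
            continue
        if not fields[0].rstrip(":").isdigit():
            continue  # skip the column header row and unrelated lines
        name = fields[7].split("@", 1)[0]  # drop the @@GLIBC_x.y version suffix
        if name == symbol and current not in found:
            found.append(current)
    return found


# Sample copied from the glibc-2.33-18.fc34 output quoted in this comment.
NEW_LIBC = """\
Symbol table '.dynsym' contains 2419 entries:
   628: 000000000011ea00   108 FUNC    WEAK   DEFAULT   15 inet_pton@@GLIBC_2.2.5
  2251: 000000000011e9b0    76 FUNC    GLOBAL DEFAULT   15 __inet_pton_length@@GLIBC_PRIVATE
Symbol table '.symtab' contains 104 entries:
"""

print(tables_defining(NEW_LIBC, "inet_pton"))
```

For this sample the helper reports only `.dynsym` for inet_pton, which is consistent with perf no longer finding the symbol via `.symtab`.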
In the older version of the library the symbol inet_pton was listed in the .symtab section:

[root@m35lp76 ~]# rpm -qa | fgrep glibc
glibc-common-2.32-4.fc33.s390x
glibc-langpack-en-2.32-4.fc33.s390x
glibc-2.32-4.fc33.s390x
glibc-headers-s390-2.32-4.fc33.noarch
glibc-devel-2.32-4.fc33.s390x
glibc-debuginfo-common-2.32-4.fc33.s390x
glibc-debuginfo-2.32-4.fc33.s390x
[root@m35lp76 ~]# readelf -sW /usr/lib64/libc-2.32.so | egrep '(dynsym|symtab|inet_pton)'
Symbol table '.dynsym' contains 2604 entries:
   668: 00000000001444b0   788 FUNC    WEAK   DEFAULT   13 inet_pton@@GLIBC_2.2
  2418: 00000000001441b0   764 FUNC    GLOBAL DEFAULT   13 __inet_pton_length@@GLIBC_PRIVATE
Symbol table '.symtab' contains 28858 entries:
 20655: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS inet_pton.c
 20656: 00000000001440b0     0 NOTYPE  LOCAL  HIDDEN    13 .annobin_inet_pton.c
 20657: 00000000001447c4     0 NOTYPE  LOCAL  HIDDEN    13 .annobin_inet_pton.c_end
 20658: 000000000002ba70     0 NOTYPE  LOCAL  HIDDEN    13 .annobin_inet_pton.c.hot
 20659: 000000000002ba70     0 NOTYPE  LOCAL  HIDDEN    13 .annobin_inet_pton.c_end.hot
 20660: 000000000002b938     0 NOTYPE  LOCAL  HIDDEN    13 .annobin_inet_pton.c.unlikely
 20661: 000000000002b938     0 NOTYPE  LOCAL  HIDDEN    13 .annobin_inet_pton.c_end.unlikely
 20662: 000000000002ba70     0 NOTYPE  LOCAL  HIDDEN    13 .annobin_inet_pton.c.startup
 20663: 000000000002ba70     0 NOTYPE  LOCAL  HIDDEN    13 .annobin_inet_pton.c_end.startup
 20664: 000000000002b968     0 NOTYPE  LOCAL  HIDDEN    13 .annobin_inet_pton.c.exit
 20665: 000000000002b968     0 NOTYPE  LOCAL  HIDDEN    13 .annobin_inet_pton.c_end.exit
 20666: 00000000001440b0     0 NOTYPE  LOCAL  HIDDEN    13 .annobin_inet_pton4.start
 20667: 00000000001441aa     0 NOTYPE  LOCAL  HIDDEN    13 .annobin_inet_pton4.end
 20668: 00000000001440b0   250 FUNC    LOCAL  DEFAULT   13 inet_pton4
 20669: 00000000001441aa     0 NOTYPE  LOCAL  HIDDEN    13 .annobin___GI___inet_pton_length.start
 20670: 00000000001444ac     0 NOTYPE  LOCAL  HIDDEN    13 .annobin___GI___inet_pton_length.end
 20671: 00000000001444ac     0 NOTYPE  LOCAL  HIDDEN    13 .annobin___GI___inet_pton.start
 20672: 00000000001447c4     0 NOTYPE  LOCAL  HIDDEN    13 .annobin___GI___inet_pton.end
 23591: 00000000001441b0   764 FUNC    LOCAL  DEFAULT   13 __GI___inet_pton_length
 23871: 00000000001444b0   788 FUNC    LOCAL  DEFAULT   13 __inet_pton
 24506: 00000000001444b0   788 FUNC    LOCAL  DEFAULT   13 __GI___inet_pton
 25831: 00000000001444b0   788 FUNC    LOCAL  DEFAULT   13 __GI_inet_pton
 26410: 00000000001441b0   764 FUNC    GLOBAL DEFAULT   13 __inet_pton_length
 27288: 00000000001444b0   788 FUNC    WEAK   DEFAULT   13 inet_pton
[root@m35lp76 ~]#

And perf could find the symbol, extract its address and install a probe on that address.

I filed bug 1975895 for considering restoring .symtab in toto.

One thing to be aware of with glibc 2.34 is (from the glibc NEWS file):

* Previously, glibc installed its various shared objects under versioned
  file names such as libc-2.33.so. The ABI sonames (e.g., libc.so.6)
  were provided as symbolic links. Starting with glibc 2.34, the shared
  objects are installed under their ABI sonames directly, without
  symbolic links. This increases compatibility with distribution
  package managers that delete removed files late during the package
  upgrade or downgrade process.

This can impact valgrind suppressions that use obj:*/lib*/libc-2.*so*
Those won't match anymore because of the file name seen in the process memory map: it will be /usr/lib64/libc.so.6 instead of /usr/lib64/libc-2.34.so.
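
To make the suppression mismatch concrete, here is a minimal sketch that approximates Valgrind's `obj:` wildcard matching with Python's `fnmatch`. That approximation is an assumption, since Valgrind has its own matcher. The second pattern, `*/lib*/libc.so*`, is a hypothetical replacement chosen for illustration and is not taken from any shipped suppression file.

```python
from fnmatch import fnmatchcase

# Pattern from the existing suppressions discussed above (without the "obj:" prefix).
OLD_PATTERN = "*/lib*/libc-2.*so*"
# Hypothetical pattern for the soname-based file name (an assumption).
NEW_PATTERN = "*/lib*/libc.so*"


def matching_patterns(obj_path, patterns):
    """Return the patterns that match the object path from the memory map."""
    return [p for p in patterns if fnmatchcase(obj_path, p)]


for path in ("/usr/lib64/libc-2.33.so", "/usr/lib64/libc.so.6"):
    print(path, matching_patterns(path, [OLD_PATTERN, NEW_PATTERN]))
```

Under these assumptions the versioned file name matches only the old pattern and the soname file name matches only the new one, so suppression files that must work before and after glibc 2.34 would likely need both forms.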