Description of problem:
When a program running under valgrind receives SIGTERM, valgrind gets stuck.

Version-Release number of selected component (if applicable):
valgrind-3.15.0-9.fc30.x86_64

How reproducible:
Sometimes. It is run in a vagrant box as part of the SSSD upstream CI. It does not happen every time; the frequency is quite low.

Steps to Reproduce:
I do not know.

Actual results:
CI tests get stuck waiting for valgrind to finish, which never happens.

Expected results:
Valgrind finishes and CI tests continue.

Additional info:

ps aux | grep valgrind
vagrant 21568 0.0 1.5 107872 61980 ? S 08:33 0:02 valgrind --log-file=/tmp/sssd-intg.CZti5UDE/var/log/sssd/valgrind_ifp.log /tmp/sssd-intg.CZti5UDE/libexec/sssd/sssd_ifp --uid 0 --gid 0 --debug-to-files

cat /tmp/sssd-intg.CZti5UDE/var/log/sssd/valgrind_ifp.log
==21568== Process terminating with default action of signal 15 (SIGTERM)
==21568==    at 0x50C7D58: __unregister_atfork (in /usr/lib64/libc-2.29.so)
==21568==    by 0x5080CE8: __cxa_finalize (in /usr/lib64/libc-2.29.so)
==21568==    by 0x5BE5BE6: ??? (in /usr/lib64/ldb/modules/ldb/memberof.so)
==21568==    by 0x401026A: _dl_fini (in /usr/lib64/ld-2.29.so)
==21568==    by 0x508067F: __run_exit_handlers (in /usr/lib64/libc-2.29.so)
==21568==    by 0x50807BF: exit (in /usr/lib64/libc-2.29.so)
==21568==    by 0x48D6290: orderly_shutdown (server.c:249)
==21568==    by 0x4EADFB5: tevent_common_invoke_signal_handler (tevent_signal.c:370)
==21568==    by 0x4EAE142: tevent_common_check_signal (tevent_signal.c:468)
==21568==    by 0x4EB017D: epoll_event_loop_once (tevent_epoll.c:909)
==21568==    by 0x4EAE41A: std_event_loop_once (tevent_standard.c:110)
==21568==    by 0x4EA9537: _tevent_loop_once (tevent.c:772)

(gdb) bt
#0  vgModuleLocal_do_syscall_for_client_WRK () at m_syswrap/syscall-amd64-linux.S:173
#1  0x00000000580a8b70 in do_syscall_for_client (syscall_mask=0x1002cadca8, tst=0x10020084b0, syscallno=202) at m_syswrap/syswrap-main.c:1964
#2  vgPlain_client_syscall (tid=tid@entry=1, trc=trc@entry=73) at m_syswrap/syswrap-main.c:1964
#3  0x00000000580a4e6b in handle_syscall (tid=tid@entry=1, trc=73) at m_scheduler/scheduler.c:1209
#4  0x00000000580a66aa in vgPlain_scheduler (tid=tid@entry=1) at m_scheduler/scheduler.c:1531
#5  0x00000000580ba318 in final_tidyup (tid=tid@entry=1) at m_main.c:2440
#6  0x00000000580ba4bd in shutdown_actions_NORETURN (tid=1, tids_schedretcode=VgSrc_FatalSig) at m_main.c:2129
#7  0x00000000580f6178 in run_a_thread_NORETURN (tidW=1) at m_syswrap/syswrap-linux.c:203
#8  0x0000000000000000 in ?? ()

(gdb) l
168        restarting it. */
169     2: syscall
170     3: /* In the range [3, 4), the syscall result is in %rax,
171        but hasn't been committed to RAX. */
172
173        POP_di_si_dx_cx_8
174
175        movq %rax, OFFSET_amd64_RAX(%rsi) /* save back to RAX */
176
177     4: /* Re-block signals. If eip is in [4,5), then the syscall
This message is a reminder that Fedora 29 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '29'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 29 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
This message is a reminder that Fedora 31 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '31'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 31 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
This still happens sporadically.
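Until the hang itself is understood, a CI harness can at least avoid waiting forever on the stuck process. The following is a minimal sketch of that idea, not something taken from the SSSD test suite: the helper name stop_process and the grace period are hypothetical, and it assumes the stuck valgrind process still responds to SIGKILL (the ps output above shows it in state S, an interruptible sleep).

```python
import signal
import subprocess
import sys


def stop_process(proc, grace_seconds=10.0):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL.

    Hypothetical helper for illustration. Returns "terminated" if the
    process exited after SIGTERM, or "killed" if SIGKILL was needed.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL cannot be caught, blocked, or ignored
        proc.wait()  # reap the child so it does not linger as a zombie
        return "killed"


if __name__ == "__main__":
    # Stand-in for the hung process: a child that ignores SIGTERM.
    child_code = (
        "import signal, time;"
        "signal.signal(signal.SIGTERM, signal.SIG_IGN);"
        "print('ready', flush=True);"
        "time.sleep(60)"
    )
    child = subprocess.Popen([sys.executable, "-c", child_code],
                             stdout=subprocess.PIPE)
    child.stdout.readline()  # wait until SIGTERM is being ignored
    print(stop_process(child, grace_seconds=0.5))
```

This only keeps the CI run moving; it does not address the underlying hang, and a process that had to be killed this way will not have written a complete valgrind log.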
I'm pretty sure I am seeing this, and also that I have a test case that (for me) reproduces this 100% of the time.

You have to compile nbdkit (https://github.com/libguestfs/nbdkit) from source which is not too difficult, then run this command in the nbdkit directory:

$ NBDKIT_VALGRIND=1 ./nbdkit -U - -v -D data.AST=1 data '@4 "\x00"' allocator=malloc --run 'qemu-img convert $uri /tmp/out'

valgrind hangs on exit.

Since this didn't happen until I upgraded this box from F32, I suspect this might actually be a kernel/glibc problem or bad interaction with valgrind.

valgrind-3.16.1-8.fc34.x86_64
kernel 5.8.15-301.fc33.x86_64
glibc-2.32.9000-18.fc34.x86_64
This bug appears to have been reported against 'rawhide' during the Fedora 34 development cycle. Changing version to 34.
(In reply to Richard W.M. Jones from comment #4)
> I'm pretty sure I am seeing this, and also that I have a test case that
> (for me) reproduces this 100% of the time.
>
> You have to compile nbdkit (https://github.com/libguestfs/nbdkit) from
> source which is not too difficult, then run this command in the nbdkit
> directory:
>
> $ NBDKIT_VALGRIND=1 ./nbdkit -U - -v -D data.AST=1 data '@4 "\x00"'
> allocator=malloc --run 'qemu-img convert $uri /tmp/out'
>
> valgrind hangs on exit.
>
> Since this didn't happen until I upgraded this box from F32, I suspect
> this might actually be a kernel/glibc problem or bad interaction with
> valgrind.
>
> valgrind-3.16.1-8.fc34.x86_64
> kernel 5.8.15-301.fc33.x86_64
> glibc-2.32.9000-18.fc34.x86_64

I think this was actually caused by subtle memory corruption in my tests, because of an incorrect kernel madvise() hint. In any case it no longer happens with the latest test and valgrind-3.17.0-11.fc35.x86_64.
FEDORA-2021-07e75edcab has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2021-07e75edcab
FEDORA-2021-07e75edcab has been pushed to the Fedora 34 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-07e75edcab` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-07e75edcab See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2021-07e75edcab has been pushed to the Fedora 34 stable repository. If the problem still persists, please make a note of it in this bug report.