Bug 512121
Summary: | Valgrind --leak-check=full crashes the application
---|---
Product: | [Fedora] Fedora
Component: | valgrind
Status: | CLOSED RAWHIDE
Severity: | medium
Priority: | low
Version: | 11
Hardware: | All
OS: | Linux
Reporter: | Milan Crha <mcrha>
Assignee: | Jakub Jelinek <jakub>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | drepper, jakub, marcandre.lureau, schwab, selinux
Doc Type: | Bug Fix
Last Closed: | 2009-08-03 07:18:18 UTC
Description
Milan Crha
2009-07-16 12:56:46 UTC
Of course valgrind massively changes the timings, that's the only thing where using valgrind matters here. In rawhide it crashes even without valgrind. I've instrumented libpthread a little bit and the first failure from pthread_mutex_unlock I get in:

failure at 152 owner 0 tid 14832 lock 14832 count 0 nusers 1
==14832==    at 0xA75D388: __pthread_mutex_unlock_full (pthread_mutex_unlock.c:152)
==14832==    by 0xA75E357: pthread_cond_wait@@GLIBC_2.3.2 (pthread_cond_wait.S:201)
==14832==    by 0x3B5E6427EC: pa_cond_wait (in /usr/lib64/libpulsecommon-0.9.15.so)
==14832==    by 0x3B5EA341BF: pa_threaded_mainloop_wait (in /usr/lib64/libpulse.so.0.8.0)
==14832==    by 0xB6943CC: pulse_driver_open (in /usr/lib64/libcanberra-0.12/libcanberra-pulse.so)
==14832==    by 0x3B62E0BC79: (within /usr/lib64/libcanberra.so.0.1.5)
==14832==    by 0x3B62E03307: (within /usr/lib64/libcanberra.so.0.1.5)
==14832==    by 0x3B62E03B5B: ca_context_play_full (in /usr/lib64/libcanberra.so.0.1.5)
==14832==    by 0x3B63A02514: ca_gtk_play_for_widget (in /usr/lib64/libcanberra-gtk.so.0.0.5)
==14832==    by 0xA551538: (within /usr/lib64/gtk-2.0/modules/libcanberra-gtk-module.so)
==14832==    by 0x3B5A037ABD: g_main_context_dispatch (in /lib64/libglib-2.0.so.0.2000.4)
==14832==    by 0x3B5A03B277: (within /lib64/libglib-2.0.so.0.2000.4)

This is a PI recursive mutex. owner, lock, count and nusers are the values of the mutex->__data.__* fields, tid is current thread's tid.

*** Bug 513854 has been marked as a duplicate of this bug. ***

The problem is that during pthread_mutex_unlock FUTEX_UNLOCK_PI | FUTEX_PRIVATE_FLAG returns ENOSYS and that's actually returned by valgrind:

   switch(ARG2) {
   case VKI_FUTEX_WAIT:
   case VKI_FUTEX_WAIT | VKI_FUTEX_PRIVATE_FLAG:
      if (ARG4 != 0)
         PRE_MEM_READ( "futex(timeout)", ARG4, sizeof(struct vki_timespec) );
      break;
   case VKI_FUTEX_REQUEUE:
   case VKI_FUTEX_REQUEUE | VKI_FUTEX_PRIVATE_FLAG:
   case VKI_FUTEX_CMP_REQUEUE:
   case VKI_FUTEX_CMP_REQUEUE | VKI_FUTEX_PRIVATE_FLAG:
      PRE_MEM_READ( "futex(futex2)", ARG5, sizeof(Int) );
      break;
   case VKI_FUTEX_WAKE:
   case VKI_FUTEX_WAKE | VKI_FUTEX_PRIVATE_FLAG:
   case VKI_FUTEX_FD:
      /* no additional pointers */
      break;
   default:
      SET_STATUS_Failure( VKI_ENOSYS ); // some futex function we don't understand
      break;

Now, in F12 when not under valgrind, I wonder if it is a similar case where that syscall fails. Can anyone try to strace it?

Created attachment 355413 [details]
Output of "strace -o strace-audacity.txt audacity"
After updating gcc and glibc packages:
Updated:
gcc.x86_64 0:4.4.1-3 glibc.x86_64 0:2.10.90-10 libgcj.x86_64 0:4.4.1-3
Dependency Updated:
cpp.x86_64 0:4.4.1-3 gcc-c++.x86_64 0:4.4.1-3
gcc-gfortran.x86_64 0:4.4.1-3 glibc-common.x86_64 0:2.10.90-10
glibc-devel.x86_64 0:2.10.90-10 glibc-headers.x86_64 0:2.10.90-10
libgcc.x86_64 0:4.4.1-3 libgfortran.x86_64 0:4.4.1-3
libgomp.x86_64 0:4.4.1-3 libstdc++.x86_64 0:4.4.1-3
libstdc++-devel.x86_64 0:4.4.1-3
Complete!
[root@tlondon ~]#
I get this strace/crash running audacity (also get crash running rhythmbox):
[tbl@tlondon ~]$ strace -o audacity-strace.txt audacity
Assertion 'pthread_mutex_unlock(&m->mutex) == 0' failed at pulsecore/mutex-posix.c:108, function pa_mutex_unlock(). Aborting.
ptrace: Operation not permitted.
/home/tbl/3336: No such file or directory.
No stack.
[tbl@tlondon ~]$
Could this be useful:
open("/var/lib/dbus/machine-id", O_RDONLY) = 20
fstat(20, {st_mode=S_IFREG|0644, st_size=33, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f74fcb95000
read(20, "1e4008bf6214497396dedf114a4d23e9\n"..., 4096) = 33
close(20) = 0
munmap(0x7f74fcb95000, 4096) = 0
socket(PF_FILE, SOCK_STREAM, 0) = 20
fcntl(20, F_GETFD) = 0
fcntl(20, F_SETFD, FD_CLOEXEC) = 0
setsockopt(20, SOL_SOCKET, SO_PRIORITY, [6], 4) = 0
fcntl(20, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(20, F_SETFL, O_RDWR|O_NONBLOCK) = 0
connect(20, {sa_family=AF_FILE, path="/home/tbl/.pulse/1e4008bf6214497396dedf114a4d23e9:runtime/native"...}, 110) = 0
futex(0x2f1b6e0, FUTEX_UNLOCK_PI_PRIVATE, 0) = 0
futex(0x2e13724, 0x8b /* FUTEX_??? */, 1) = 0
write(2, "Assertion 'pthread_mutex_unlock(&"..., 126) = 126
rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
tgkill(3336, 3336, SIGABRT) = 0
--- SIGABRT (Aborted) @ 0 (0) ---
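The trace above shows one futex operation that strace could not name (0x8b). As a rough sketch, the C helpers below decode a raw futex op using the numeric values from the Linux <linux/futex.h> header, and model only the case labels of the valgrind switch quoted earlier in this report. Every SKETCH_ and sketch_ name is made up for illustration; none of this is valgrind or glibc code.

```c
#include <stddef.h>

/* Numeric values as defined in the Linux UAPI header <linux/futex.h>.
 * The SKETCH_ prefix marks every name here as illustrative. */
enum {
    SKETCH_FUTEX_WAIT        = 0,
    SKETCH_FUTEX_WAKE        = 1,
    SKETCH_FUTEX_FD          = 2,
    SKETCH_FUTEX_REQUEUE     = 3,
    SKETCH_FUTEX_CMP_REQUEUE = 4,
    SKETCH_FUTEX_UNLOCK_PI   = 7,
    SKETCH_FUTEX_PRIVATE_FLAG   = 128,
    SKETCH_FUTEX_CLOCK_REALTIME = 256
};

/* Command names indexed by command number (0..12). */
static const char *const sketch_futex_cmd_names[] = {
    "FUTEX_WAIT", "FUTEX_WAKE", "FUTEX_FD", "FUTEX_REQUEUE",
    "FUTEX_CMP_REQUEUE", "FUTEX_WAKE_OP", "FUTEX_LOCK_PI",
    "FUTEX_UNLOCK_PI", "FUTEX_TRYLOCK_PI", "FUTEX_WAIT_BITSET",
    "FUTEX_WAKE_BITSET", "FUTEX_WAIT_REQUEUE_PI", "FUTEX_CMP_REQUEUE_PI"
};

/* Strip the option flags and return the bare command number. */
int sketch_futex_cmd(int op)
{
    return op & ~(SKETCH_FUTEX_PRIVATE_FLAG | SKETCH_FUTEX_CLOCK_REALTIME);
}

/* Nonzero when the op carries FUTEX_PRIVATE_FLAG. */
int sketch_futex_is_private(int op)
{
    return (op & SKETCH_FUTEX_PRIVATE_FLAG) != 0;
}

/* Return the command name for a raw futex op, or "unknown". */
const char *sketch_futex_cmd_name(int op)
{
    int cmd = sketch_futex_cmd(op);
    size_t count = sizeof sketch_futex_cmd_names
                 / sizeof sketch_futex_cmd_names[0];

    if (cmd < 0 || (size_t)cmd >= count)
        return "unknown";
    return sketch_futex_cmd_names[cmd];
}

/* Models only the case labels in the quoted valgrind excerpt:
 * returns 1 if the op matches a listed case, 0 if it would fall
 * through to the default branch that sets ENOSYS. */
int sketch_quoted_switch_accepts(int op)
{
    switch (op) {
    case SKETCH_FUTEX_WAIT:
    case SKETCH_FUTEX_WAIT | SKETCH_FUTEX_PRIVATE_FLAG:
    case SKETCH_FUTEX_REQUEUE:
    case SKETCH_FUTEX_REQUEUE | SKETCH_FUTEX_PRIVATE_FLAG:
    case SKETCH_FUTEX_CMP_REQUEUE:
    case SKETCH_FUTEX_CMP_REQUEUE | SKETCH_FUTEX_PRIVATE_FLAG:
    case SKETCH_FUTEX_WAKE:
    case SKETCH_FUTEX_WAKE | SKETCH_FUTEX_PRIVATE_FLAG:
    case SKETCH_FUTEX_FD:
        return 1;
    default:
        return 0;
    }
}
```

With these definitions, sketch_futex_cmd_name(0x8b) gives "FUTEX_WAIT_REQUEUE_PI" with the private flag set, and sketch_quoted_switch_accepts(SKETCH_FUTEX_UNLOCK_PI | SKETCH_FUTEX_PRIVATE_FLAG) is 0, which matches the ENOSYS behaviour described for the unlock path under valgrind.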
I've checked in a patch for this upstream. It is untested, since I don't have such a new kernel that actually works with my machines.

The valgrind bug is fixed in the rawhide valgrind packages, and the glibc bug is fixed in the rawhide glibc packages.

F11 has only just been released; can this be ported there too, please? Not having basic development tools available in Fedora is not the best thing, at least from my point of view. I'm not going to install a broken rawhide, because I'm supposed to develop for another application. Even though I do not understand what the problem is with backporting the patches for this to F11, I'm willing to compile my own local *upstream* versions of valgrind/glib so that I can continue with the tools I need for my work.

What are the exact upstream versions/commits where this bug was fixed, please? Obviously, other people and distributions might be interested, so I would also appreciate those pointers. Thanks!