Description of problem:
Some functions (e.g. dwfl_module_getelf) return DWFL_E_CB even though the related library seems to be alright. It can be reproduced by running eu-stack on the provided coredump.

Version-Release number of selected component (if applicable):
elfutils-0.158-4.fc20.x86_64

How reproducible:
Hopefully always.

Steps to Reproduce:
1. Obtain F20 system.
2. # yum install koji mock
3. # usermod -a -G mock user
4. # su - user
5. $ wget http://.../ccpp-callback-failure.tar.xz
6. $ wget http://.../setup-retracing.py
   (I'll provide the actual locations in next comment)
7. $ tar xJvf ccpp-callback-failure.tar.xz
8. $ ./setup-retracing.py ccpp-callback-failure/
   (This will download the RPMs that contain libraries referenced by the core dump from koji and unpack them to fedora-20-x86_64 mock chroot.)
9. $ mock -r fedora-20-x86_64 --copyin ccpp-callback-failure /tmp/ccpp-callback-failure
10. $ mock -r fedora-20-x86_64 --shell
11. <mock># cd /tmp/ccpp-callback-failure && eu-stack --core coredump

Actual results:
PID 1664 - core
(...)
TID 1700:
#0 0x00000031df00bd20 pthread_cond_wait@@GLIBC_2.3.2
#1 0x00000031f9c236b0 PR_WaitCondVar
#2 0x00000031e65470de
eu-stack: dwfl_thread_getframes tid 1700 at 0x31e65470dd in libmozjs-17.0.so: Callback returned failure
TID 1699:
#0 0x00000031df00bd20 pthread_cond_wait@@GLIBC_2.3.2
#1 0x00000031f9c236b0 PR_WaitCondVar
#2 0x00000031e64b0700
eu-stack: dwfl_thread_getframes tid 1699 at 0x31e64b06ff in libmozjs-17.0.so: Callback returned failure
(...)

Expected results:
Complete stack traces for the affected thread.

Additional info:
Running GDB on the core seems to produce complete stack trace. libmozjs-17.0.so is present and with the same build id as in the core. It's likely that the bug is not related to unwinding as we get the same error when calling dwfl_module_getelf.
Created attachment 912508
Do not rely on link_map.l_addr.

Problem 1:
$ eu-stack --core /tmp/ccpp-callback-failure/coredump
#18 0x00007fc661daece8 meta_run
#19 0x0000000000402131
eu-stack: dwfl_thread_getframes tid 1664 at 0x402130 in [exe]: Callback returned failure

Problem 1 is in how ABRT arranges the retracing chroot and/or how ABRT calls eu-stack. This is not a bug in elfutils.

Solution 1.A:
$ eu-stack -e /usr/bin/gnome-shell --core /tmp/ccpp-callback-failure/coredump
[...]
#18 0x00007fc661daece8 meta_run
#19 0x0000000000402131 main
[...]

Solution 1.B:
$ gdb /tmp/ccpp-callback-failure/coredump
Missing separate debuginfo for the main executable file
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/a3/93a342ffc386364d06e61da0b06c1e1f972eb4
$ yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/a3/93a342ffc386364d06e61da0b06c1e1f972eb4
$ eu-stack --core /tmp/ccpp-callback-failure/coredump
#18 0x00007fc661daece8 meta_run
#19 0x0000000000402131 main

Problem 2:
TID 1700:
#0 0x00000031df00bd20 pthread_cond_wait@@GLIBC_2.3.2
#1 0x00000031f9c236b0 PR_WaitCondVar
#2 0x00000031e65470de
eu-stack: dwfl_thread_getframes tid 1700 at 0x31e65470dd in libmozjs-17.0.so: Callback returned failure
TID 1699:
#0 0x00000031df00bd20 pthread_cond_wait@@GLIBC_2.3.2
#1 0x00000031f9c236b0 PR_WaitCondVar
#2 0x00000031e64b0700
eu-stack: dwfl_thread_getframes tid 1699 at 0x31e64b06ff in libmozjs-17.0.so: Callback returned failure

Solution:
Attached l_addr.patch. It needs a testcase and then I will post it to the elfutils mailing list.

The problem was that the user who dumped the core file had prelinked DSOs, but the DSOs downloaded from the RPMs are not prelinked. This is certainly an elfutils bug (in my code, though not in the unwinder but in the r_debug reader for DSOs).

TID 1700:
#0 0x00000031df00bd20 pthread_cond_wait@@GLIBC_2.3.2
#1 0x00000031f9c236b0 PR_WaitCondVar
#2 0x00000031e65470de js::SourceCompressorThread::threadLoop()
[...]
TID 1699:
#0 0x00000031df00bd20 pthread_cond_wait@@GLIBC_2.3.2
#1 0x00000031f9c236b0 PR_WaitCondVar
#2 0x00000031e64b0700 js::GCHelperThread::threadLoop()
[...]
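
To illustrate why a prelinked/non-prelinked mismatch breaks a reader that trusts link_map.l_addr, here is a rough Python sketch of the address arithmetic only. It is not the elfutils code. It assumes that a safer bias can be derived from link_map.l_ld (the runtime address of the dynamic section) minus the PT_DYNAMIC p_vaddr of the file actually found on disk; that mechanism is my reading of the patch title, not something confirmed in this report. The prelink base 0x31e6400000 and the .dynamic offset 0x3a0000 are made-up illustrative values; only the PC 0x31e65470dd comes from the trace above.

```python
# Illustrative arithmetic only; all constants except PC are hypothetical.

PRELINK_BASE = 0x31E6400000      # assumed address prelink chose for the DSO
DYNAMIC_OFFSET = 0x3A0000        # assumed offset of .dynamic inside the DSO
PC = 0x31E65470DD                # address reported by eu-stack for TID 1700

# Values as recorded in the core by the dynamic linker. The dumped process
# used a prelinked file, so the file's vaddrs already equal the runtime
# addresses and the recorded load bias (l_addr) is zero.
L_ADDR = 0x0
L_LD = PRELINK_BASE + DYNAMIC_OFFSET

# PT_DYNAMIC p_vaddr of the file unpacked from the RPM (not prelinked,
# so its segments start near zero).
RPM_DYNAMIC_VADDR = DYNAMIC_OFFSET


def bias_from_l_addr(l_addr):
    """Bias as taken directly from link_map.l_addr."""
    return l_addr


def bias_from_l_ld(l_ld, file_dynamic_vaddr):
    """Bias derived from where .dynamic is at runtime vs. in the file on disk."""
    return l_ld - file_dynamic_vaddr


bad_bias = bias_from_l_addr(L_ADDR)
good_bias = bias_from_l_ld(L_LD, RPM_DYNAMIC_VADDR)

# With the l_addr bias the PC maps far outside the non-prelinked file;
# with the l_ld-derived bias it maps to a plausible file-relative address.
print(hex(PC - bad_bias))
print(hex(PC - good_bias))
```

With a bias of zero the non-prelinked file would be placed at its own (near-zero) addresses, so no module covers the PC correctly, which is consistent with the failure seen for libmozjs-17.0.so.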
The fix for Problem 2 is now checked in upstream: 475849fdb25265706772905b856cd7028c566a71

Regarding the NEEDINFO: Jakub Filak has asked for a new rpm release with this fix, IIUC for all Fedoras.
I backported the fix to rawhide elfutils-0.159-8.fc22. If that works for you, we can push that version to older Fedora releases too.
elfutils 0.160 has been released and included in Fedora f21/rawhide. As said in comment #4, let us know if that works for you; then we can see whether to backport it to earlier Fedora releases. Thanks.
Verified that 0.160 works fine - thanks for the quick fix! Backport to earlier versions will be much appreciated.
elfutils-0.160-1.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/elfutils-0.160-1.fc20
elfutils-0.160-1.fc19 has been submitted as an update for Fedora 19. https://admin.fedoraproject.org/updates/elfutils-0.160-1.fc19
elfutils-0.160-1.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.
elfutils-0.160-1.fc19 has been pushed to the Fedora 19 stable repository. If problems still persist, please make note of it in this bug report.