Created attachment 362662 [details]
Proposed ABRT fix.

Description of problem:
There were complaints (from mcepl) that ABRT sometimes produces bogus backtraces. This can happen in the common case where the running program is an older version than its on-disk files, which YUM upgraded while the program was running.

There are two problems:
* ABRT calls GDB with an explicit `file' command, thus incorrectly overriding GDB's automatic lookup of the matching executable binary (this Bug).
* GDB could load a library matching just by its name, not by its build-id stored in the core file (fixed now for GDB in Rawhide as Bug 524572).

Version-Release number of selected component (if applicable):
abrt-0.0.9-2.fc12

How reproducible:
just checked the sources

Steps to Reproduce:
wget http://kojipkgs.fedoraproject.org/packages/coreutils/7.6/{4,5}.fc12/x86_64/coreutils-{,libs-,debuginfo-}7.6-{4,5}.fc12.x86_64.rpm
rm -f core.*; rpm -U --oldpackage *7.6-4*; (ulimit -c unlimited; sleep 1h& p=$!; sleep 1; rpm -U *7.6-5*; kill -SEGV $p)

Actual results:
echo CURRENT:; gdb -q -ex bt -ex q sleep ./core.*
CURRENT:
Reading symbols from /bin/sleep...Reading symbols from /usr/lib/debug/bin/sleep.debug...done.
done.
warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib64/librt-2.10.90.so...Reading symbols from /usr/lib/debug/lib64/librt-2.10.90.so.debug...done.
done.
Loaded symbols for /lib64/librt-2.10.90.so
Reading symbols from /lib64/libc-2.10.90.so...Reading symbols from /usr/lib/debug/lib64/libc-2.10.90.so.debug...done.
done.
Loaded symbols for /lib64/libc-2.10.90.so
Reading symbols from /lib64/libpthread-2.10.90.so...Reading symbols from /usr/lib/debug/lib64/libpthread-2.10.90.so.debug...done.
done.
Loaded symbols for /lib64/libpthread-2.10.90.so
Reading symbols from /lib64/ld-2.10.90.so...Reading symbols from /usr/lib/debug/lib64/ld-2.10.90.so.debug...done.
done.
Loaded symbols for /lib64/ld-2.10.90.so
Core was generated by `sleep 1h'.
Program terminated with signal 11, Segmentation fault.
#0 0x00000038500a7380 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:82
82 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
#0 0x00000038500a7380 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:82
#1 0x0000000000403d9b in rpl_nanosleep (requested_delay=0x7fff3813c8c0, remaining_delay=0x0) at nanosleep.c:69
#2 0x000000000040343b in xnanosleep (seconds=<value optimized out>) at xnanosleep.c:112
#3 0x00000000004016fc in main (argc=2, argv=<value optimized out>) at sleep.c:147
Current language: auto
The current source language is "auto; currently asm".

Expected results:
echo REQUESTED:; gdb -q -ex bt -ex q -c ./core.*
REQUESTED:
Missing separate debuginfo for the main executable file
Try: yum --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/06/b0a4f8365ef2efae035d4f83eb44befb60d543
Core was generated by `sleep 1h'.
Program terminated with signal 11, Segmentation fault.
#0 0x00000038500a7380 in ?? ()
#0 0x00000038500a7380 in ?? ()
#1 0x0000000000403d9b in ?? ()
#2 0x0010000000000000 in ?? ()
#3 0x000000384fc14bc5 in ?? ()
#4 0x000000000006bae8 in ?? ()
#5 0x0000000031090c07 in ?? ()
#6 0x0000000000000e10 in ?? ()
#7 0x0000000000000000 in ?? ()

Additional info:
In this case the "Actual results" dump is in fact correct, as the differences between coreutils-7.6-{4,5}.fc12.x86_64 are not visible here. But in the general case the "Actual results" dump can be completely bogus.

This difference in the patch certainly has no effect at present:
- fTmp << "core " << pDebugDumpDir << "/"FILENAME_COREDUMP"\n";
+ fTmp << "core-file " << pDebugDumpDir << "/"FILENAME_COREDUMP"\n";
but one should not rely on the shortened command names, as they sometimes change.
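The change requested in this report (load the core only, and spell out `core-file' in full) can be sketched roughly as follows. This is a minimal illustration, not the actual ABRT code: the helper name build_gdb_script, the placeholder value of FILENAME_COREDUMP, and the trailing `backtrace' line are all assumptions made for the example.

```cpp
#include <sstream>
#include <string>

// Placeholder value: the real FILENAME_COREDUMP macro is defined by ABRT
// and its value is not shown in this report.
#define FILENAME_COREDUMP "coredump"

// Hypothetical helper: builds a GDB command script that loads only the core
// file, so GDB can pick the matching executable itself.
std::string build_gdb_script(const std::string& debug_dump_dir)
{
    std::ostringstream script;
    // Deliberately no "file <executable>" line here: an explicit `file'
    // command would override GDB's own lookup of the matching binary.
    // The full command name "core-file" is used instead of the short "core".
    script << "core-file " << debug_dump_dir << "/" FILENAME_COREDUMP "\n";
    // Illustrative assumption: request a backtrace using the full command name.
    script << "backtrace\n";
    return script.str();
}
```

Writing the returned string to a temporary file and passing that file to GDB as a command script would then correspond to the `gdb -c ./core.*' invocation shown under "Expected results".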
Applied to git:
[master f0e609b] Jan Kratochvil's fix (#525721): use core _only_, not executable image for backtrace
1 files changed, 9 insertions(+), 4 deletions(-)
Thinking about it some more, this patch could have been kept as a vendor (Fedora-only) one. But I do not know how ABRT treats its upstream vs. its Fedora downstream. FSF GDB currently still does not support locating the binary from the core file (by its build-id or in any other way). The locating-by-build-id support is still being kept as a vendor (Fedora) GDB patch. With this patch, upstream ABRT will not work with upstream GDB.
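One conceivable way to reconcile the two cases is to emit the `file' command only when the installed GDB cannot locate the binary on its own. The sketch below is purely hypothetical: neither the helper make_gdb_commands nor the flag gdb_finds_binary_by_build_id exists in ABRT, and how such a capability would be detected is left open.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper, not ABRT code: emits the explicit "file" command only
// when the GDB in use cannot find the executable from the core file itself.
std::string make_gdb_commands(const std::string& executable,
                              const std::string& core_path,
                              bool gdb_finds_binary_by_build_id)
{
    std::ostringstream cmds;
    if (!gdb_finds_binary_by_build_id) {
        // Fallback for GDB builds without build-id lookup. This may pair the
        // core with an already upgraded on-disk binary, which is the bogus
        // backtrace case described in this report.
        cmds << "file " << executable << "\n";
    }
    cmds << "core-file " << core_path << "\n";
    cmds << "backtrace\n";
    return cmds.str();
}
```

With the flag set, the script matches the core-only behaviour of the patch; with it cleared, the script falls back to the previous behaviour, including its known weakness.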
Ok, I'm reopening it till it's clear whether this is a good idea or what to do otherwise.
BTW I find it a must for Fedora. It is just that possible use of ABRT in other distros may be broken by this patch.
Please do not push this change into Fedora for now, it has a regression described at: https://fedorahosted.org/pipermail/crash-catcher/2009-October/000052.html Sorry about that.
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle. Changing version to '12'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
abrt-1.0.0-1.fc12 has been submitted as an update for Fedora 12. http://admin.fedoraproject.org/updates/abrt-1.0.0-1.fc12
abrt-1.0.0-1.fc12 has been pushed to the Fedora 12 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update abrt'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F12/FEDORA-2009-12098
abrt-1.0.0-1.fc12 has been pushed to the Fedora 12 stable repository. If problems still persist, please make note of it in this bug report.