Bug 593037

Summary: abrt does not save core dumps of deleted binaries
Product: Red Hat Enterprise Linux 6 Reporter: Marc Milgram <mmilgram>
Component: abrt Assignee: Jiri Moskovcak <jmoskovc>
Status: CLOSED CURRENTRELEASE QA Contact: Michal Nowak <mnowak>
Severity: medium Docs Contact:
Priority: low    
Version: 6.0 CC: ahecox, dfediuck, dvlasenk, gavin, jifl-bugzilla, juha.heljoranta, kklic, mnowak, npajkovs, ohudlick
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: abrt-1.1.5-1.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 612838 (view as bug list) Environment:
Last Closed: 2010-11-10 19:33:00 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 612838    

Description Marc Milgram 2010-05-17 17:09:34 UTC
Description of problem:
evolution-data-server segfaulted, abrt caught the dump, then deleted it thinking that it did not belong to any package.

Version-Release number of selected component (if applicable):
abrt-1.1.1-1.el6.x86_64
evolution-data-server-2.28.3-3.el6.x86_64
prelink-0.4.3-3.el6.x86_64

How reproducible:
Unknown

Steps to Reproduce:
1. Unknown
2.
3.
  
Actual results:
abrt deletes dump automatically before it is reported

Expected results:
abrt doesn't delete the dump on its own

Additional info:
Here are the associated messages:
May 17 12:49:13 mmilgramd kernel: evolution-data-[20404]: segfault at 45c758 ip 0000003a4de09160 sp 00007f26c506bbc8 error 4 in libpthread-2.12.so[3a4de00000+17000]
May 17 12:49:15 mmilgramd abrt[13783]: saved core dump of pid 20403 (/usr/libexec/evolution-data-server-2.28.#prelink#.i6K34Z (deleted)) to /var/cache/abrt/ccpp-1274114953-20403.new/coredump (135446528 bytes)
May 17 12:49:15 mmilgramd abrtd: Directory 'ccpp-1274114953-20403' creation detected
May 17 12:49:15 mmilgramd abrtd: Executable '/usr/libexec/evolution-data-server-2.28.#prelink#.i6K34Z (deleted)' doesn't belong to any package
May 17 12:49:15 mmilgramd abrtd: Corrupted or bad crash /var/cache/abrt/ccpp-1274114953-20403 (res:4), deleting
May 17 12:56:06 mmilgramd abrtd: Getting crash infos...

I suspect that evolution-data-server started running, prelink ran some time later, and later still evolution-data-server segfaulted, by which point abrt no longer considered the executable to be associated with a package. But this is an untested hypothesis.

Comment 1 Nikola Pajkovsky 2010-05-17 21:51:32 UTC
Packages aren't signed. You have to edit /etc/abrt/abrt.conf and change OpenGPGCheck = yes to OpenGPGCheck = no

Comment 2 Jonathan Larmour 2010-05-18 20:17:55 UTC
On Fedora 12 I've had a similar issue. In my case NetworkManager segfaulted, and the end result was this in /var/log/messages:

May 18 08:25:17 lert abrtd: Executable '/usr/sbin/NetworkManager (deleted)' doesn't belong to any package
May 18 08:25:17 lert abrtd: Corrupted or bad crash /var/cache/abrt/ccpp-1274167516-1423 (res:4), deleting

The gpg key id (from /etc/pki/rpm-gpg/RPM-GPG-KEY-fedora in my case) does match the package's signature (from rpm -qi NetworkManager).

Looking at the output of 'stat /usr/sbin/NetworkManager' I can see the binary was changed at 2010-05-17 03:39:43, and looking at /var/log/cron shows that is exactly when prelink was running. So the OP's suspicion is correct - it is prelink.

So once prelink runs, abrt becomes ineffective for resident applications as it will reject all modified (but still running) binaries.
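
To illustrate the failure mode: the kernel appends " (deleted)" to the /proc/PID/exe link target once the original file has been unlinked, and the log line in the description suggests that prelink's temporary copy carries a ".#prelink#.XXXXXX" suffix. Below is a minimal Python sketch of recovering the on-disk path before a package lookup. It assumes those two naming patterns; the function name normalize_exe_path is hypothetical and this is not abrt's actual code.

```python
import re
from typing import Tuple

# Assumed patterns, inferred from the log lines in this report:
#  - the kernel appends " (deleted)" to the exe link of an unlinked binary
#  - prelink's temporary copy is named "<original>.#prelink#.<6 chars>"
_DELETED_SUFFIX = " (deleted)"
_PRELINK_RE = re.compile(r"\.#prelink#\.[A-Za-z0-9]{6}$")


def normalize_exe_path(exe_link: str) -> Tuple[str, bool]:
    """Return (path, was_deleted) for a /proc/PID/exe link target."""
    was_deleted = exe_link.endswith(_DELETED_SUFFIX)
    # Drop the kernel's " (deleted)" marker, if present.
    path = exe_link[: -len(_DELETED_SUFFIX)] if was_deleted else exe_link
    # Drop the (assumed) prelink temporary-file suffix, if present.
    path = _PRELINK_RE.sub("", path)
    return path, was_deleted
```

The returned flag lets a caller still record that the running image was replaced, rather than silently treating the crash as if nothing had changed.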

Comment 3 Denys Vlasenko 2010-05-20 13:00:46 UTC
Yes, this is true.

However, note that if the binary is deleted, it might mean that a modified one has been installed, and therefore processing the crash becomes more difficult: gdb can't produce a good backtrace from the coredump alone; it needs the corresponding binary as well.

In other words: just running "core-file COREDUMP" and "bt" usually results in a worse backtrace than running "file BINARY", "core-file COREDUMP" and "bt".
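
As a rough illustration of the two invocations being compared, here is a Python sketch that builds a batch-mode gdb command line with and without the binary. The helper name gdb_backtrace_argv is hypothetical; -batch and -ex are standard gdb options, and the gdb commands are the ones quoted above.

```python
from typing import List, Optional


def gdb_backtrace_argv(coredump: str, binary: Optional[str] = None) -> List[str]:
    """Build a gdb command line that prints a backtrace from a core dump."""
    argv = ["gdb", "-batch"]
    if binary is not None:
        # Load the matching executable first so gdb has its symbols.
        argv += ["-ex", f"file {binary}"]
    # Then load the core dump and ask for a backtrace.
    argv += ["-ex", f"core-file {coredump}", "-ex", "bt"]
    return argv
```

Passing the resulting list to subprocess.run would execute it; omitting the binary argument gives the coredump-only case.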

I hesitate to fix the bug you mention, because it results in more bugs reported, yes, but these bugs are more likely to have badly formed, uninformative backtraces.

Comment 4 Denys Vlasenko 2010-05-20 13:03:05 UTC
*** Bug 593373 has been marked as a duplicate of this bug. ***

Comment 5 Jonathan Larmour 2010-05-23 21:14:57 UTC
> I hesitate to fix the bug you mention, because it results in more bugs
> reported, yes, but these bugs are more likely to have badly formed,
> uninformative backtraces.    

It's possible this could be the case, but firstly I would expect any genuine change of binary to result in very obviously different/bogus backtraces.

Secondly, it would be entirely appropriate for abrt to put some warning in its report to say that it detected the binary had changed (maybe in a ***BIG OBVIOUS WAY*** :-)). So bug handlers can watch for that.
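
A sketch of what such a check might look like, in Python. It assumes the process start time is known as an epoch timestamp, and it uses the inode change time (ctime) on the assumption that replacing a file always refreshes it. The function name binary_change_warning and the warning wording are hypothetical, not anything abrt provides.

```python
import os
from typing import Optional


def binary_change_warning(exe_path: str, process_start: float) -> Optional[str]:
    """Return a warning line if exe_path is gone or changed after process_start."""
    try:
        # ctime is used rather than mtime (an assumption of this sketch):
        # a tool that replaces the file may restore mtime, but not ctime.
        changed_at = os.stat(exe_path).st_ctime
    except FileNotFoundError:
        return f"*** WARNING: {exe_path} no longer exists; backtrace may be unreliable ***"
    if changed_at > process_start:
        return f"*** WARNING: {exe_path} changed after the process started; backtrace may be unreliable ***"
    return None
```

A report generator could prepend the returned line, when it is not None, to the backtrace section.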

Finally, I don't think the signal-to-noise ratio will be that bad - the main scenario we're concerned about is the program image really changing, and in most cases that's likely to be because of an RPM upgrade. But in practice many RPM upgrades affecting long-lived processes are likely to cause those processes to be restarted.

So I think the risks you are worried about are manageable.

Comment 7 RHEL Program Management 2010-06-17 12:53:22 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 10 releng-rhel@redhat.com 2010-11-10 19:33:00 UTC
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.