Bug 1132459

Summary: Need more correct cleanup behaviour when /var/spool/abrt is full
Product: Red Hat Enterprise Linux 7
Reporter: Konstantin Lepikhov <klepikho>
Component: abrt
Assignee: Jakub Filak <jfilak>
Status: CLOSED ERRATA
QA Contact: Martin Kyral <mkyral>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.0
CC: dkochuka, jberan, jbuchta, jfilak, mhabrnal, mkyral
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: abrt-2.1.11-44.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1132510
Environment:
Last Closed: 2016-11-04 03:05:00 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1132510, 1203710, 1295829, 1313485
Attachments:
Patch 1/2: daemon: trigger dump location cleanup after detection (Flags: none)
Patch 2/2: handle-event: stop creating post-create lock (Flags: none)
Patch 1/1: daemon: send base names from abrt-server to abrtd (Flags: none)

Description Konstantin Lepikhov 2014-08-21 11:54:30 UTC
Description of problem:
The original description, taken from #sfdc case 01176977:

----
Aug 19 11:25:22 el1191 abrtd: Directory 'ccpp-2014-08-19-11:25:19-2206' creation detected
Aug 19 11:25:22 el1191 abrtd: Size of '/var/spool/abrt' >= 1000 MB, deleting 'ccpp-2014-08-19-11:24:35-56940'
Aug 19 11:25:22 el1191 abrt[4235]: /var/spool/abrt is 1348804852 bytes (more than 1279MiB), deleting 'ccpp-2014-08-19-11:24:35-56940'
----

I can imagine that when abrtd removes directories because the size exceeds a specific threshold while abrt processes are still working with those files and directories, the abrt scripts fail at different locations depending on timing.

I do understand that there is a mechanism to prevent the filesystem from filling up, but I think the abrt Python scripts should not generate unhandled exceptions that cause new crashes. It might be as simple as catching these specific exceptions and logging the event to syslog before exiting with an error and an error message.

All in all, that wouldn't be a dramatic change to what is happening right now, with the added value that these events are logged and no new crashes are initiated.
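The suggestion above could look roughly like the following. This is only a sketch and not code from abrt: the function names `read_dump_item` and `main` are hypothetical, and the file name "executable" is used purely for illustration. It treats a vanished file or directory (ENOENT) as an expected event, logs it to syslog, and exits with an error instead of raising.

```python
import errno
import os
import sys
import syslog


def read_dump_item(dump_dir, name):
    """Return the content of one file in a dump directory.

    Returns None if the file or the whole directory has disappeared,
    for example because a concurrent cleanup removed it.
    """
    path = os.path.join(dump_dir, name)
    try:
        with open(path) as item:
            return item.read()
    except (IOError, OSError) as ex:
        # Only a missing file/directory is expected here; re-raise the rest.
        if ex.errno != errno.ENOENT:
            raise
        syslog.syslog(syslog.LOG_ERR,
                      "'%s' disappeared during processing: %s" % (path, ex))
        return None


def main(dump_dir):
    """Hypothetical script entry point: non-zero exit instead of a traceback."""
    if read_dump_item(dump_dir, "executable") is None:
        sys.stderr.write("Dump directory '%s' is gone, giving up\n" % dump_dir)
        return 1
    return 0
```

A script would end with `sys.exit(main(sys.argv[1]))`, so a removed directory produces a logged error and exit status 1 rather than a new Python crash.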

Version-Release number of selected component (if applicable):
abrt-2.0.8-21.el6.x86_64
abrt-addon-ccpp-2.0.8-21.el6.x86_64
abrt-addon-kerneloops-2.0.8-21.el6.x86_64
abrt-addon-python-2.0.8-21.el6.x86_64
abrt-cli-2.0.8-21.el6.x86_64
abrt-libs-2.0.8-21.el6.x86_64
abrt-tui-2.0.8-21.el6.x86_64

How reproducible:
Always, when the size of the /var/spool/abrt directory is bigger than the predefined threshold.

Steps to Reproduce:
1. Create enough crashes that the size of the directory exceeds the predefined threshold.
2. Crash an application again.

Actual results:
The abrt process tries to delete the current crash dump directory and run sosreport on it at the same time.

Expected results:
If cleanup is needed for that directory, delete just the oldest crash dump, not the latest one.
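The expected behaviour can be sketched as follows. This is an illustration under stated assumptions, not abrt's implementation (abrtd itself is written in C): `dir_size` and `trim_spool` are hypothetical names, age is approximated by directory mtime, and the `preserve` argument stands for the dump directory that is currently being processed.

```python
import os
import shutil


def dir_size(path):
    """Sum the sizes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished in the meantime
    return total


def trim_spool(spool, max_bytes, preserve=None):
    """Delete the oldest dump directories until the spool fits max_bytes.

    The directory whose base name equals `preserve` is never deleted.
    Returns the base names of the removed directories, oldest first.
    """
    entries = [os.path.join(spool, name) for name in os.listdir(spool)]
    dumps = [p for p in entries if os.path.isdir(p) and not os.path.islink(p)]
    dumps.sort(key=lambda p: os.lstat(p).st_mtime)  # oldest first
    sizes = dict((p, dir_size(p)) for p in dumps)
    total = sum(sizes.values())

    removed = []
    for path in dumps:
        if total <= max_bytes:
            break
        if os.path.basename(path) == preserve:
            continue  # never delete the dump being processed
        shutil.rmtree(path, ignore_errors=True)
        total -= sizes[path]
        removed.append(os.path.basename(path))
    return removed
```

For example, `trim_spool("/var/spool/abrt", 1000 * 1024 * 1024, preserve="ccpp-2014-08-19-11:25:19-2206")` would remove older directories first and leave the newly detected one alone, even if the spool stays above the limit.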

Comment 3 Jakub Filak 2016-02-19 11:08:24 UTC
I have developed a test reproducing the unwanted behavior and found that this bug is still not fixed:
https://github.com/abrt/abrt/commit/4837a0a1aeb8c78b4bc4e07326de48daf8e63f8b

Comment 4 Jakub Filak 2016-07-22 11:39:12 UTC
Fixed upstream: https://github.com/abrt/abrt/pull/1164

Comment 5 Matej Habrnal 2016-08-02 10:58:01 UTC
Created attachment 1186746
Patch 1/2: daemon: trigger dump location cleanup after detection

Comment 6 Matej Habrnal 2016-08-02 10:58:04 UTC
Created attachment 1186747
Patch 2/2: handle-event: stop creating post-create lock

Comment 8 Jakub Filak 2016-08-23 12:57:59 UTC
*** Bug 1369433 has been marked as a duplicate of this bug. ***

Comment 9 Jakub Filak 2016-08-26 09:38:11 UTC
There is a little problem with the patches. If abrt-server receives a path that is not in canonical form, abrtd fails to remove such a process from its queue if the relevant problem directory is removed due to the size limit restrictions.
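To illustrate why a non-canonical path breaks the bookkeeping, and why comparing base names avoids it, here is a minimal sketch. It is not the actual abrtd code: `PendingQueue` and `dump_base_name` are hypothetical names, and the queue is modelled as a plain dictionary.

```python
import os


def dump_base_name(path):
    """Reduce any spelling of a dump directory path to its last component."""
    return os.path.basename(os.path.normpath(path))


class PendingQueue(object):
    """Hypothetical queue of processes waiting on a dump directory."""

    def __init__(self):
        self._pending = {}

    def add(self, pid, path):
        # Key by base name so that '/spool//dir/' and '/spool/./dir' match.
        self._pending[dump_base_name(path)] = pid

    def remove_deleted(self, deleted_path):
        """Drop the entry for a deleted directory; return its pid or None."""
        return self._pending.pop(dump_base_name(deleted_path), None)
```

If the queue were keyed by the raw string instead, an entry added as `/var/spool/abrt//ccpp-2014-08-19-11:24:35-56940/` would never match the canonical path reported when the directory is deleted, and the entry would stay in the queue.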

Comment 10 Jakub Filak 2016-08-26 11:55:45 UTC
Upstream commit fixes the problem with canonical paths:
https://github.com/abrt/abrt/pull/1176/commits/d0a35daa628652aa83e7b890051b32bd35402ec8

Comment 12 Matej Habrnal 2016-08-29 07:30:09 UTC
Created attachment 1195193
Patch 1/1: daemon: send base names from abrt-server to abrtd

Comment 15 errata-xmlrpc 2016-11-04 03:05:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2307.html