Bug 1253908 - retrace-server cleanup job should not remove tasks with open vmcores
Status: NEW
Product: Fedora EPEL
Classification: Fedora
Component: retrace-server
Version: epel7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Assigned To: abrt
QA Contact: Fedora Extras Quality Assurance
 
Reported: 2015-08-15 07:25 EDT by Dave Wysochanski
Modified: 2018-02-05 13:47 EST

Doc Type: Bug Fix
Type: Bug


Description Dave Wysochanski 2015-08-15 07:25:46 EDT
Description of problem:
Today the retrace-server cleanup job only checks the mtime of the task directory and removes the task once it is too old (failed tasks and successful tasks have separate retention periods). It needs an additional check that the vmcore (crash/vmcore) is not open before removal. When an old task still has an open vmcore, someone may simply have forgotten to close their crash session, but the task may also be a 'failed task' whose vmcore is nevertheless usable. The latter is a real possibility, and we have at least one open bug about it today (https://bugzilla.redhat.com/show_bug.cgi?id=1149356). In either case, retrace-server should not remove a task while its vmcore is open.
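
One way the missing check could be done on Linux is sketched below. This is only an illustration, not retrace-server code: the function name is_file_open and its proc_root parameter are made up for this example. It walks /proc/<pid>/fd and compares device and inode numbers with the target file. It only sees file descriptors of processes the caller is allowed to inspect, so a real cleanup job would probably need to run as root or use a tool such as lsof or fuser instead.

```python
import os


def is_file_open(path, proc_root="/proc"):
    """Return True if a visible process holds an open descriptor on path.

    Hypothetical helper for illustration; Linux-only (needs procfs).
    """
    try:
        target = os.stat(path)
    except OSError:
        # The file does not exist (or is unreadable), so report "not open".
        return False
    target_id = (target.st_dev, target.st_ino)

    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission to look
        for fd in fds:
            try:
                # stat() follows the fd symlink to the underlying file.
                st = os.stat(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed while we were scanning
            if (st.st_dev, st.st_ino) == target_id:
                return True
    return False
```

Comparing device and inode rather than path names means the check still matches when the vmcore was opened through a symlink or a different relative path.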

Version-Release number of selected component (if applicable):
retrace-server-1.12-3.el6.noarch

How reproducible:
Easily reproducible. In practice I have only seen it a couple of times on our system (likely because of our DeleteFailedTaskAfter and DeleteTaskAfter settings). Some people have brought it up, but I can't recall anyone complaining recently.

Steps to Reproduce:
1. Run 'retrace-server-worker <task> crash' on a vmcore and leave the session open for longer than the applicable retention setting (either DeleteFailedTaskAfter or DeleteTaskAfter)

Actual results:
The task is removed even though someone has the 'crash/vmcore' file open.

Expected results:
The task is not removed if someone has the 'crash/vmcore' file open.
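
The expected behaviour could be expressed roughly as follows. This is a sketch under assumptions, not the actual cleanup code: should_remove_task, max_age_seconds, and the is_open callback are hypothetical names, and the retention period is passed in seconds here because the units of DeleteTaskAfter and DeleteFailedTaskAfter are not stated in this report. The caller would pick the retention value according to whether the task failed or succeeded.

```python
import os
import time


def should_remove_task(task_dir, max_age_seconds, is_open, now=None):
    """Decide whether a task directory may be removed by the cleanup job.

    task_dir        -- path of the task directory
    max_age_seconds -- retention period (assumed to be given in seconds)
    is_open         -- callable(path) -> bool, True if the file is in use
    now             -- current time as a Unix timestamp (for testing)
    """
    if now is None:
        now = time.time()

    # Existing behaviour: only the mtime of the task directory is checked.
    age = now - os.path.getmtime(task_dir)
    if age <= max_age_seconds:
        return False

    # Proposed extra check: never remove a task whose vmcore is still open.
    vmcore = os.path.join(task_dir, "crash", "vmcore")
    return not is_open(vmcore)
```

Passing the open-file test in as a callback keeps the decision logic independent of how "open" is detected (procfs scan, lsof, fuser, and so on).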

Additional info:
Documented for the record, but low priority for us because our DeleteTaskAfter value is fairly large (currently 6 months) and the incidence is low. If we ever lower DeleteFailedTaskAfter or DeleteTaskAfter, or if the rate of 'failed tasks' that are still useful goes up, it may become more of an issue.
